FOUR HUMAN ZINC-FINGER-CONTAINING PROTEINS:
MDZ3, MDZ4, MDZ7 AND MDZ12



REFERENCE TO SEQUENCE LISTING SUBMITTED ON COMPACT DISC 

The present application includes a Sequence
Listing filed on one CD-R disc, provided in duplicate,
containing a single file named sequence.txt, having
1,048,233 bytes, last modified on August 2, 2001 and
recorded August 2, 2001. The Sequence Listing contained in
said file on said disc is incorporated herein by reference
in its entirety.

FIELD OF THE INVENTION 

The present invention relates to four novel human
zinc-finger-containing proteins: MDZ3, MDZ4, MDZ7 and MDZ12,
and a splice variant of MDZ12, MDZ12b (the original MDZ12 is
here named MDZ12a for this reason). More specifically, the
invention provides isolated nucleic acid molecules encoding
MDZ3, MDZ4, MDZ7, MDZ12, fragments thereof, vectors and host
cells comprising isolated nucleic acid molecules encoding
MDZ3, MDZ4, MDZ7 and MDZ12; MDZ3, MDZ4, MDZ7 and MDZ12
polypeptides, antibodies, transgenic cells and non-human
organisms, and diagnostic, therapeutic, and investigational
methods of using the same.

BACKGROUND OF THE INVENTION 

Zinc-finger (ZNF)-containing genes collectively
represent the largest family of sequence-specific nucleic
acid binding regulatory proteins in mammalian genomes, with
600-1000 family members estimated for the human genome
(Hoovers et al., Genomics 12:254-263 (1992)). Most
ZNF-containing proteins are thought to act as DNA-binding
regulators of transcription, either activating or repressing
expression of specific genes (El-Baradi and Pieler, Mech.
Dev. 35:155-169 (1991)). However, a few family members have
been found to bind specifically to RNA (e.g., ZNF74; Grondin
et al., J. Biol. Chem. 271:15458-15467 (1996)).

The most common form of the ZNF motif, termed
Kruppel or C2H2-type, was first described as a repeated
domain in the Xenopus transcription factor TFIIIA (Miller et
al., EMBO J. 4:1609-1614 (1985)), but is named after a
similar domain found in the Drosophila Kruppel protein,
which also functions as a transcription factor (Rosenberg et
al., Nature 319:336-339 (1986)). Kruppel (C2H2)-type ZNF
motifs share a common backbone polypeptide sequence of
CX2CX3FX5LX2HX3H, where X can be any amino acid. ZNF motifs
may be found dispersed throughout a polypeptide or may be
clustered into one or more tandemly repeated blocks. In
cases where the repeats are arranged in tandem, adjacent
motifs are typically joined by a highly conserved stretch of
seven amino acids (TGEKPYX, with X being any amino acid)
termed the H-C link (Schuh et al., Cell 47:1025-1032
(1986)). The conserved cysteine (C) and histidine (H)
residues collectively coordinate a single zinc ion, with the
intervening amino acids forming an alpha helical structure
that is responsible for specific binding to a nucleotide
triplet. Among proteins with multiple, clustered ZNF
motifs, each repeated unit binds to a different, but
consecutive, nucleotide triplet (Suzuki et al., Nucl. Acids
Res. 22:3397-3405 (1994)). Thus, for proteins with
multiple ZNF repeats, it is the concerted activity of most
or all of the repeat units that determines the proteins' DNA
binding properties and ultimate biological function.
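By way of illustration only, the backbone consensus given above can be written as a regular expression and used to scan a protein sequence. The following is a minimal sketch, not a validated motif finder: it takes the spacing exactly as written (CX2CX3FX5LX2HX3H), whereas natural zinc fingers may deviate from it, and both the function name and the toy sequence are hypothetical.

```python
import re

# Backbone consensus from the text: C-X2-C-X3-F-X5-L-X2-H-X3-H,
# where X (".") is any amino acid. Spacing is taken literally.
C2H2_BACKBONE = re.compile(r"C.{2}C.{3}F.{5}L.{2}H.{3}H")


def find_c2h2_backbones(protein):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group())
            for m in C2H2_BACKBONE.finditer(protein.upper())]


# Hypothetical toy sequence: two motifs joined by a TGEKPYX-style linker.
toy = "MK" + "CPECGKSFSQKSNLQKHQRTH" + "TGEKPY" + "KCEECGKAFSRSSHLTTHKIIH"

for start, motif in find_c2h2_backbones(toy):
    print(start, motif)
```

Counting the matches gives a rough estimate of the number of ZNF repeats in a candidate sequence; a profile-based domain search would be the more robust choice in practice.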

Another hallmark of ZNF proteins is that they
frequently contain additional functional domains that
further define their biological activities. One motif that
is commonly found in conjunction with ZNF motifs in
mammalian proteins is the Kruppel-associated box (KRAB)
motif, first described by Bellefroid et al., Proc. Natl.
Acad. Sci. USA 88:3608-3612 (1991). This motif can be
divided into two sub-elements referred to as KRAB A and B
domains. However, not all proteins contain both A and B
domains (Bellefroid et al., EMBO J. 12:1363-1374 (1993)).
Several studies have demonstrated that the KRAB A domain
acts as a repressor of transcription (Margolin et al., Proc.
Natl. Acad. Sci. USA 91:4509-4513 (1994); Witzgall et al.,
Proc. Natl. Acad. Sci. USA 91:4515-4518 (1994); Pengue et
al., Nucl. Acids Res. 22:2908-2914 (1994)). The KRAB B
domain by itself does not convey strong transcriptional
repressor function, but may enhance the activity of an
associated KRAB A domain (Margolin et al., Proc. Natl. Acad.
Sci. USA 91:4509-4513 (1994)). A general mechanism for
transcriptional repression appears to be shared by all
members of the KRAB subfamily of ZNF proteins, and involves
the recruitment of the common co-repressor protein
KAP1/KRIP-1/TIF1-beta by KRAB-ZNF proteins to sites of gene
repression (Friedman et al., Genes Devel. 10:2067-2078
(1996); Kim et al., Proc. Natl. Acad. Sci. USA
93:15299-15304 (1996); Moosmann et al., Nucl. Acids Res.
24:4849-4867 (1996)).

A second conserved motif that is often found in
Kruppel (C2H2)-type ZNF proteins is called the SCAN box or
LeR (leucine-rich) domain (Yokoyama et al., Biochim.
Biophys. Acta 1353:13-17 (1997); Williams et al., Mol. Cell
Biol. 19:8526-8535 (1999)). The SCAN box functions
primarily as an oligomerization domain that allows
self-association of proteins or association between different
proteins containing compatible SCAN boxes (Williams et al.,
Mol. Cell Biol. 19:8526-8535 (1999)). Typically, the SCAN
box is located at the N-terminus of the protein while the
ZNF motifs are clustered near its C-terminus. In contrast
to the KRAB domain, the SCAN box does not have a direct
activator or repressor role in specific gene transcription.

ZNF proteins have been implicated in a wide range
of biological activities that includes regulation of cell
proliferation and differentiation as well as control of
patterns of embryonic development. For example, mouse
Krox20 is a ZNF protein that is expressed specifically in
the hindbrain, and disruption of the gene by mutation
results in abnormalities in brain development and death
shortly after birth (Swiatek and Gridley, Genes Devel.
7:2071-2084 (1993)). Another ZNF gene, human SALL1, has
been shown to be responsible for Townes-Brocks
syndrome, an autosomal-dominant disorder characterized by
abnormalities in the development of ears, limbs and kidneys
(Kohlhase et al., Nature Genet. 18:81-83 (1998)).

ZNF proteins have also been implicated in
tumorigenesis in humans. For example, the product of the
Wilms' Tumor gene (WT1), a Kruppel-type ZNF protein, is
normally expressed during kidney development. However,
mutations that affect the DNA binding ZNF motifs can result
in embryonal renal neoplasia (Call et al., Cell 60:509-520
(1990)). Mutations in other ZNF genes have also been shown
to be the causative agents in a number of other cancers,
including acute promyelocytic leukemia (Chen et al., EMBO J.
12:1161-1167 (1993); Shaknovich et al., Mol. Cell Biol.
18:5533-5545 (1998)) and t(8;13) leukemia/lymphoma syndrome
(Xiao et al., Nature Genet. 18:84-87 (1998)).

Recent reports suggest that at least one-third,
and likely a higher percentage, of human genes are
alternatively spliced (Hanke et al., Trends Genet.
15(1):389-390 (1999); Mironov et al., Genome Res.
9:1288-93 (1999); Brett et al., FEBS Lett. 474(1):83-6 (2000)).
Alternative splicing has been proposed to account for at
least part of the difference between the number of genes
recently called from the completed human genome draft
sequence, 30,000 to 40,000 (Genome International Sequencing
Consortium, Nature 409:860-921 (15 February 2001)), and
earlier predictions of human gene number that routinely
ranged as high as 120,000 (Liang et al., Nature Genet.
25(2):239-240 (2000)). With the Drosophila homolog of one
human gene reported to have 38,000 potential alternatively
spliced variants (Schmucker et al., Cell 101:671 (2000)), it
now appears that alternative splicing may permit the
relatively small number of human coding regions to encode
millions, perhaps tens of millions, of structurally distinct
proteins and protein isoforms.

Given the demonstrated roles of ZNF proteins in
cell proliferation, differentiation and development, and in
tumorigenesis when expressed aberrantly,
there is a need to identify and to characterize human genes
that encode ZNF-containing proteins. Given the importance
of alternative splicing in providing further tissue-specific
and developmental regulation of human proteins, there is a
need to identify and to characterize splice variants of
ZNF-containing proteins.

SUMMARY OF THE INVENTION 

The present invention solves these and other needs
in the art by providing isolated nucleic acids that encode
four novel human zinc-finger-containing proteins, MDZ3,
MDZ4, MDZ7 and MDZ12a, as well as a splice variant of
MDZ12a, termed MDZ12b, and fragments thereof.

MDZ3 encodes a protein with a SCAN box and a more
divergent KRAB domain, in addition to 7 ZNF motifs. MDZ4
encodes a protein with a SCAN box domain in addition to 5 ZNF
motifs. MDZ7 encodes a protein that contains 7 ZNF motifs
and no additional known elements. MDZ12 encodes a protein
with a divergent and possibly partial form of the KRAB motif
(containing only the B domain) along with 12 ZNF repeats.

We have also isolated a splice variant of MDZ12,
designated MDZ12b (and have thus denominated the originally
isolated transcript MDZ12a), which is expressed in some of
the tissues tested. MDZ12b may encode a protein with 12 ZNF
repeats (termed MDZ12bL) if the internal initiation
methionine is used for protein translation; otherwise it
encodes a 44 amino acid peptide (termed MDZ12bS) due to the
introduction of a stop codon in the inserted exon.

In other aspects, the invention provides vectors
for propagating and expressing the nucleic acids of the
present invention, host cells comprising the nucleic acids
and vectors of the present invention, proteins, protein
fragments, and protein fusions of any of the four novel
human zinc-finger-containing proteins: MDZ3, MDZ4, MDZ7,
MDZ12a and MDZ12b (S and L), and antibodies thereto.

The invention further provides pharmaceutical
formulations of the nucleic acids, proteins, and antibodies
of the present invention.

In other aspects, the invention provides
transgenic cells and non-human organisms comprising nucleic
acids of any one or more of four novel human
zinc-finger-containing genes (MDZ3, MDZ4, MDZ7, MDZ12a and MDZ12b),
and transgenic cells and non-human organisms with targeted
disruption of the endogenous orthologue of any of the four
novel human zinc-finger-containing genes, MDZ3, MDZ4, MDZ7,
MDZ12a and MDZ12b.

The invention additionally provides diagnostic,
investigational, and therapeutic methods based on the
nucleic acids, proteins, antibodies, mimetics, agonists, and
antagonists of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other objects and advantages of the
present invention will be apparent upon consideration of the
following detailed description taken in conjunction with the
accompanying drawings, in which like characters refer to
like parts throughout, and in which:

FIGS. 1A - 1C schematize the protein domain
structure of MDZ3, including the overall structure of MDZ3
and the alignment of the SCAN box and KRAB domain in MDZ3 with
similar motifs;

FIG. 2 is a map showing the genomic structure of 
MDZ3 encoded at chromosome 7q22.1; 



FIG. 3 presents the nucleotide and predicted amino 
acid sequences of MDZ3; 

FIGS. 4A and 4B schematize the protein domain 
structure of MDZ4, including the overall structure of MDZ4 
and the alignment of the SCAN box in MDZ4 with similar 
motifs; 

FIG. 5 is a map showing the genomic structure of 
MDZ4 encoded at chromosome 6p21.3-22.2; 

FIG. 6 presents the nucleotide and predicted amino 
acid sequences of MDZ4; 

FIG. 7 schematizes the protein domain structure of
MDZ7;

FIG. 8 is a map showing the genomic structure of
MDZ7 encoded at chromosome 16p11.2;

FIG. 9 presents the nucleotide and predicted amino
acid sequences of MDZ7;

FIGS. 10A and 10B schematize the protein domain 
structure of MDZ12, including the overall structure of 
MDZ12a and MDZ12bL (the longer ORF generated using the 
internal translation initiation site) and the alignment of 
the KRAB domain in MDZ12a with similar motifs; 

FIG. 11 is a map showing the genomic structure of 
MDZ12a and MDZ12b encoded at chromosome 15q26.1; 

FIG. 12 presents the nucleotide and predicted 
amino acid sequences of MDZ12a; 

FIG. 13 presents the nucleotide and predicted 
amino acid sequences of MDZ12b; 

FIG. 14 presents the RT-PCR analysis of MDZ3
gene expression;

FIG. 15 presents the RT-PCR analysis of MDZ7
gene expression; and

FIG. 16 presents the RT-PCR analysis of MDZ12
gene expression.

DETAILED DESCRIPTION OF THE INVENTION

Mining the sequence of the human genome for novel
human genes, the present inventors have identified four
novel human zinc-finger-containing genes: MDZ3, MDZ4, MDZ7
and MDZ12 (a and b). Each of the four genes encodes a
sequence-specific nucleic acid binding regulatory protein
that plays a role in cell proliferation, differentiation and
development. When expressed aberrantly, the genes
contribute to neoplasia.

Detailed Description Of The MDZ3 Gene 

As schematized in FIG. 1, the protein product of
the newly isolated MDZ3 gene shares certain protein domains
and an overall structural organization with a family of
other zinc-finger proteins. The shared structural features
strongly imply that MDZ3 plays a role similar to that of the
SCAN box- and KRAB motif-containing Kruppel family
zinc-finger proteins as a sequence-specific nucleic acid binding
regulatory protein, and is likely to participate in
protein-protein interactions with other transcription modulators.
Thus, MDZ3 is a clinically useful diagnostic marker and
potential therapeutic agent for a variety of diseases,
including developmental disorders and cancer.

In common with the other family members of the
SCAN box-containing Kruppel zinc-finger proteins, MDZ3 has a
SCAN box near the N-terminus, which has been shown to
participate in protein-protein interactions. The C-terminal
region of the MDZ3 protein contains seven copies of C2H2
zinc fingers. There is also weak homology to the KRAB domain in
the middle of MDZ3.

FIG. 2 shows the genomic organization of MDZ3.

At the top is shown approximately 14 kb of the 177
kb bacterial artificial chromosome (BAC), with GenBank
accession number, that spans the MDZ3 locus.

As shown in FIG. 2, MDZ3, encoding a protein of
544 amino acids, comprises exons 1-8. Predicted molecular
weight of the protein, prior to any post-translational
modification, is 61.4 kD.

As further discussed in the examples herein,
expression of MDZ3 was assessed using RT-PCR analysis.
RT-PCR product for MDZ3 was clearly produced from brain,
testis, heart and bone marrow, but not from lung, liver, or
skeletal muscle.

Detailed Description Of The MDZ4 Gene

As schematized in FIG. 4, the product of the newly
isolated MDZ4 gene shares certain protein domains and an
overall structural organization with a family of zinc-finger
proteins. The shared structural features strongly imply that
MDZ4 plays a role similar to that of the SCAN box-containing
Kruppel family zinc-finger proteins as a potential
transcription regulator, and is likely to participate in
protein-protein interactions with other transcription
modulators. Thus, MDZ4 is a clinically useful diagnostic
marker and potential therapeutic agent for a variety of
diseases, including developmental disorders and cancer.

In common with the other family members of the
SCAN box-containing Kruppel zinc-finger proteins, MDZ4 has a
SCAN box near the N-terminus, which has been shown to
participate in protein-protein interactions. The C-terminal
region of the MDZ4 protein contains five copies of C2H2 zinc
fingers.

FIG. 5 shows the genomic organization of MDZ4.

At the top is shown about 9 kb of the 128 kb P1
artificial chromosome (PAC) (with GenBank accession number),
that spans the MDZ4 locus. The genome-derived single-exon
probe first used to demonstrate expression from this locus
is shown below the PAC and is labeled "500"; the 500 bp
probe includes sequence drawn from exon two as well as
flanking intron two.

As shown in FIG. 5, MDZ4, encoding a protein of
389 amino acids, comprises exons 1-4. Predicted
molecular weight, prior to any post-translational
modification, is 44.9 kD.

As further discussed in the examples herein,
expression of MDZ4 was assessed using hybridization to
genome-derived single exon microarrays. Microarray analysis
of exons 2 and 3 showed expression in all tissues tested,
including bone marrow, brain, heart, HeLa, adult liver,
fetal liver, lung, placenta and prostate.

Detailed Description Of The MDZ7 Gene

As schematized in FIG. 7, the newly isolated MDZ7
gene product is mainly composed of seven tandemly arrayed
Kruppel-type (C2H2) zinc finger repeats. Such a structure
implies that MDZ7 is likely to function in sequence-specific
DNA binding and impart a regulatory effect on specific gene
expression. Thus, MDZ7 is a clinically useful diagnostic
marker and potential therapeutic agent for a variety of
diseases, including developmental disorders and cancer.

FIG. 8 shows the genomic organization of MDZ7.

At the top is shown about 5.5 kb of the 121 kb
bacterial artificial chromosome (BAC), with GenBank
accession number, that spans the MDZ7 locus.

As shown in FIG. 8, MDZ7 comprises four
exons and encodes a protein of 248 amino acids. Predicted
molecular weight of the MDZ7 protein, prior to any
post-translational modification, is 72.0 kD.

As further discussed in the examples herein,
expression of MDZ7 was assessed using RT-PCR. RT-PCR
analysis of MDZ7 showed expression only in testes, but not
in brain, lung, liver, kidney, skeletal muscle, heart, whole
fetus, or HeLa cells.

Detailed Description Of The MDZ12 Gene 



As schematized in FIG. 10A, and further shown in
FIG. 11, the newly isolated MDZ12 encodes a MDZ12a isoform
and potentially two other isoforms, which we designate 12bL
(for "long"), and 12bS (for "short"); FIG. 10A shows the
12bL isoform. MDZ12a contains a partial KRAB motif as well
as twelve C2H2 zinc fingers. MDZ12bL encodes a protein with
12 C2H2 zinc fingers. Such features strongly imply that
MDZ12a (and MDZ12bL, if translated) plays a role as a
potential transcription regulator, and is likely to
participate in protein-protein interactions with other
transcription modulators. Thus MDZ12 is a clinically useful
diagnostic marker and potential therapeutic agent for a
variety of diseases, including developmental disorders and
cancer.

FIG. 11 shows the genomic organization of MDZ12.

At the top is shown a portion of the 173 kb
bacterial artificial chromosome (BAC), with GenBank
accession number, that spans the MDZ12 locus. The
genome-derived single-exon probe first used to demonstrate
expression from this locus includes sequence drawn solely
from exon 4 of MDZ12a.

As shown in FIG. 11, MDZ12a encodes a protein of
483 amino acids, comprising exons 1-4. Predicted
molecular weight of the protein, prior to any
post-translational modification, is 55.1 kD. The inclusion of a
novel exon between exons 2 and 3 introduces an in-frame stop
codon in MDZ12b, and thus MDZ12b encodes a short polypeptide
of 44 amino acids (MDZ12bS). The use of an internal
methionine as initiation methionine in MDZ12b could
potentially further encode a 332 amino acid protein
(MDZ12bL). The predicted molecular weight of the MDZ12bL
protein, prior to any post-translational modification, is
38.2 kD.

As further discussed in the examples herein,
expression of MDZ12 was assessed using RT-PCR. The abundance
of PCR product indicates that MDZ12a is expressed in all
tissues examined, with highest expression in brain, heart,
skeletal muscle, testis and HeLa cells. MDZ12b, however, is
expressed with lower to much lower abundance compared with
MDZ12a in bone marrow, brain, heart, kidney, placenta,
skeletal muscle, testis and HeLa cells, with almost no
expression in liver.

As more fully described below, the present
invention provides isolated nucleic acids that encode MDZ3,
MDZ4, MDZ7, MDZ12a and MDZ12b and fragments thereof. The
invention further provides vectors for propagation and
expression of the nucleic acids of the present invention,
host cells comprising the nucleic acids and vectors of the
present invention, proteins, protein fragments, and protein
fusions of the present invention, and antibodies specific
for all or any one of the isoforms. The invention provides
pharmaceutical formulations of the nucleic acids, proteins,
and antibodies of the present invention. The invention
further provides transgenic cells and non-human organisms
comprising human MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids,
and transgenic cells and non-human organisms with targeted
disruption of the endogenous orthologue of the human MDZ3,
MDZ4, MDZ7 or MDZ12. The invention additionally provides
diagnostic, investigational, and therapeutic methods based
on the MDZ3, MDZ4, MDZ7, MDZ12 nucleic acids, proteins, and
antibodies of the present invention.



DEFINITIONS 


Unless defined otherwise, all technical and 
scientific terms used herein have the meaning commonly 
understood by one of ordinary skill in the art to which this 
invention belongs. 

As used herein, "nucleic acid" (synonymously,
"polynucleotide") includes polynucleotides having natural
nucleotides in native 5'-3' phosphodiester linkage (e.g.,
DNA or RNA) as well as polynucleotides that have nonnatural
nucleotide analogues, nonnative internucleoside bonds, or
both, so long as the nonnatural polynucleotide is capable of
sequence-discriminating basepairing under experimentally
desired conditions. Unless otherwise specified, the term
"nucleic acid" includes any topological conformation; the
term thus explicitly comprehends single-stranded,
double-stranded, partially duplexed, triplexed, hairpinned,
circular, and padlocked conformations.

As used herein, an "isolated nucleic acid" is a
nucleic acid molecule that exists in a physical form that is
nonidentical to any nucleic acid molecule of identical
sequence as found in nature; "isolated" does not require,
although it does not prohibit, that the nucleic acid so
described has itself been physically removed from its native
environment.

For example, a nucleic acid can be said to be
"isolated" when it includes nucleotides and/or
internucleoside bonds not found in nature. When instead
composed of natural nucleosides in phosphodiester linkage, a
nucleic acid can be said to be "isolated" when it exists at
a purity not found in nature, where purity can be adjudged
with respect to the presence of nucleic acids of other
sequence, with respect to the presence of proteins, with
respect to the presence of lipids, or with respect to the
presence of any other component of a biological cell, or
when the nucleic acid lacks sequence that flanks an
otherwise identical sequence in an organism's genome, or
when the nucleic acid possesses sequence not identically
present in nature.

As so defined, "isolated nucleic acid" includes
nucleic acids integrated into a host cell chromosome at a
heterologous site, recombinant fusions of a native fragment
to a heterologous sequence, and recombinant vectors present as
episomes or as integrated into a host cell chromosome.

As used herein, an isolated nucleic acid "encodes"
a reference polypeptide when at least a portion of the
nucleic acid, or its complement, can be directly translated
to provide the amino acid sequence of the reference
polypeptide, or when the isolated nucleic acid can be used,
alone or as part of an expression vector, to express the
reference polypeptide in vitro, in a prokaryotic host cell,
or in a eukaryotic host cell.

As used herein, the term "exon" refers to a
nucleic acid sequence found in genomic DNA that is
bioinformatically predicted and/or experimentally confirmed
to contribute contiguous sequence to a mature mRNA
transcript.

As used herein, the phrase "open reading frame"
and the equivalent acronym "ORF" refer to that portion of a
transcript-derived nucleic acid that can be translated in
its entirety into a sequence of contiguous amino acids. As
so defined, an ORF has length, measured in nucleotides,
exactly divisible by 3. As so defined, an ORF need not
encode the entirety of a natural protein.
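For illustration only, the two properties stated in this definition (length divisible by 3, and translatable in its entirety) can be checked as follows. This is a minimal sketch under stated assumptions: "translatable in its entirety" is read as containing no stop codon in frame, the standard stop codons are used, and the function name is hypothetical. Consistent with the definition, no initiating ATG is required.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def is_orf(seq):
    """True if seq could be an ORF as defined above (assumed reading:
    non-empty, unambiguous bases, length divisible by 3, no in-frame stop)."""
    s = seq.upper().replace("U", "T")  # accept RNA or DNA spelling
    if not s or len(s) % 3 != 0:
        return False
    if set(s) - set("ACGT"):
        return False
    codons = (s[i:i + 3] for i in range(0, len(s), 3))
    return not any(codon in STOP_CODONS for codon in codons)


print(is_orf("GCCGCCATG"))   # length 9, no stop codon in frame
print(is_orf("ATGTAAGCC"))   # in-frame TAA interrupts translation
```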

As used herein, the phrase "ORF-encoded peptide"
refers to the predicted or actual translation of an ORF.

As used herein, the phrase "degenerate variant" of
a reference nucleic acid sequence intends all nucleic acid
sequences that can be directly translated, using the
standard genetic code, to provide an amino acid sequence
identical to that translated from the reference nucleic acid
sequence.
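The definition of a degenerate variant can be illustrated with a short sketch that translates two sequences with the standard genetic code and compares the results. The function names are hypothetical, and the sketch assumes DNA-alphabet input that is read in frame from its first base.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an in-frame DNA sequence whose length is a multiple of 3."""
    s = seq.upper()
    if len(s) % 3 != 0:
        raise ValueError("sequence length must be divisible by 3")
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, len(s), 3))


def is_degenerate_variant(candidate, reference):
    """True if both sequences translate to the identical amino acid sequence."""
    return translate(candidate) == translate(reference)


# GCT and GCC both encode alanine, so these two sequences are degenerate variants.
print(is_degenerate_variant("ATGGCC", "ATGGCT"))
```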

As used herein, the term "microarray" and the
equivalent phrase "nucleic acid microarray" refer to a
substrate-bound collection of plural nucleic acids,
hybridization to each of the plurality of bound nucleic
acids being separately detectable. The substrate can be
solid or porous, planar or non-planar, unitary or
distributed.

As so defined, the term "microarray" and phrase
"nucleic acid microarray" include all the devices so called
in Schena (ed.), DNA Microarrays: A Practical Approach
(Practical Approach Series), Oxford University Press (1999)
(ISBN: 0199637768); Nature Genet. 21(1)(suppl):1-60
(1999); and Schena (ed.), Microarray Biochip: Tools and
Technology, Eaton Publishing Company/BioTechniques Books
Division (2000) (ISBN: 1881299376), the disclosures of which
are incorporated herein by reference in their entireties.

As so defined, the term "microarray" and phrase
"nucleic acid microarray" also include substrate-bound
collections of plural nucleic acids in which the plurality
of nucleic acids are distributably disposed on a plurality
of beads, rather than on a unitary planar substrate, as is
described, inter alia, in Brenner et al., Proc. Natl. Acad.
Sci. USA 97(4):1665-1670 (2000), the disclosure of which is
incorporated herein by reference in its entirety; in such
case, the term "microarray" and phrase "nucleic acid
microarray" refer to the plurality of beads in aggregate.

As used herein with respect to solution phase 
hybridization, the term "probe", or equivalently, "nucleic 
acid probe" or "hybridization probe", refers to an isolated 
nucleic acid of known sequence that is, or is intended to 
be, detectably labeled. As used herein with respect to a 
nucleic acid microarray, the term "probe" (or equivalently 
"nucleic acid probe" or "hybridization probe") refers to the 
isolated nucleic acid that is, or is intended to be, bound 
to the substrate. In either such context, the term "target" 
refers to nucleic acid intended to be bound to probe by 
sequence complementarity. 

As used herein, the expression "probe comprising
SEQ ID NO:X", and variants thereof, intends a nucleic acid
probe, at least a portion of which probe has either (i) the
sequence directly as given in the referenced SEQ ID NO:X, or
(ii) a sequence complementary to the sequence as given in
the referenced SEQ ID NO:X, the choice as between sequence
directly as given and complement thereof dictated by the
requirement that the probe be complementary to the desired
target.
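The choice between the sequence as given and its complement can be sketched as follows, for illustration only. The sketch assumes a perfect-match model of hybridization (the probe's reverse complement must occur verbatim in the target), which is far stricter than real hybridization; the function names and the short sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def choose_probe(reference, target):
    """Return the form of `reference` (as given, or its reverse complement)
    that is complementary to `target`, or None if neither is.
    Assumption: complementarity means an exact reverse-complement match."""
    as_given, complement = reference.upper(), reverse_complement(reference.upper())
    target = target.upper()
    if complement in target:   # probe as given pairs with the target
        return as_given
    if as_given in target:     # target carries the same strand: use the complement
        return complement
    return None


print(choose_probe("ATGC", "TTATGCAA"))  # target is same-sense, so the complement is chosen
```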

As used herein, the phrases "expression of a
probe" and "expression of an isolated nucleic acid" and
their linguistic equivalents intend that the probe (or,
respectively, the isolated nucleic acid), or a probe (or,
respectively, isolated nucleic acid) complementary in
sequence thereto, can hybridize detectably under high
stringency conditions to a sample of nucleic acids that
derive from mRNA transcripts from a given source. For
example, and by way of illustration only, expression of a
probe in "liver" means that the probe can hybridize
detectably under high stringency conditions to a sample of
nucleic acids that derive from mRNA obtained from liver.

As used herein, "a single exon probe" comprises at 
least part of an exon ("reference exon") and can hybridize 
detectably under high stringency conditions to 
transcript-derived nucleic acids that include the reference 
exon. The single exon probe will not, however, hybridize 
detectably under high stringency conditions to nucleic acids 
that lack the reference exon and that instead consist of one 
or more exons that are found adjacent to the reference exon 
in the genome. 

For purposes herein, "high stringency conditions" 
are defined for solution phase hybridization as aqueous 
hybridization (i.e., free of formamide) in 6X SSC (where 20X 
SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS at 
65°C for at least 8 hours, followed by one or more washes in 
0.2X SSC, 0.1% SDS at 65°C. "Moderate stringency 
conditions" are defined for solution phase hybridization as 
aqueous hybridization (i.e., free of formamide) in 6X SSC, 
1% SDS at 65°C for at least 8 hours, followed by one or more 
washes in 2X SSC, 0.1% SDS at room temperature.
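As a worked illustration of the SSC notation used above, the salt concentrations at any dilution follow by simple proportion from the stated composition of 20X SSC (3.0 M NaCl and 0.3 M sodium citrate). The function name below is hypothetical; the arithmetic uses only the figures given in the text.

```python
# Composition of 20X SSC as stated in the text.
NACL_M_AT_20X = 3.0
CITRATE_M_AT_20X = 0.3


def ssc_molarity(fold):
    """Return (NaCl M, sodium citrate M) for an SSC solution at `fold` X."""
    scale = fold / 20.0
    return NACL_M_AT_20X * scale, CITRATE_M_AT_20X * scale


# Hybridization buffer (6X) and the moderate (2X) and high (0.2X) stringency washes.
for fold in (6, 2, 0.2):
    nacl, citrate = ssc_molarity(fold)
    print(f"{fold}X SSC: {nacl:.3f} M NaCl, {citrate:.4f} M sodium citrate")
```

The lower salt concentration of the 0.2X wash, combined with the 65°C wash temperature, is what distinguishes the high stringency conditions from the moderate ones.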

For microarray-based hybridization, standard "high
stringency conditions" are defined as hybridization in 50%
formamide, 5X SSC, 0.2 µg/µl poly(dA), 0.2 µg/µl human Cot-1
DNA, and 0.5% SDS, in a humid oven at 42°C overnight,
followed by successive washes of the microarray in 1X SSC,
0.2% SDS at 55°C for 5 minutes, and then 0.1X SSC, 0.2% SDS,
at 55°C for 20 minutes. For microarray-based hybridization,
"moderate stringency conditions", suitable for
cross-hybridization to mRNA encoding structurally- and
functionally-related proteins, are defined to be the same as
those for high stringency conditions but with reduction in
temperature for hybridization and washing to room
temperature (approximately 25°C).

As used herein, the terms "protein",
"polypeptide", and "peptide" are used interchangeably to
refer to a naturally-occurring or synthetic polymer of amino
acid monomers (residues), irrespective of length, where
amino acid monomer here includes naturally-occurring amino
acids, naturally-occurring amino acid structural variants,
and synthetic non-naturally occurring analogs that are
capable of participating in peptide bonds. The terms
"protein", "polypeptide", and "peptide" explicitly permit
post-translational and post-synthetic modifications, such
as glycosylation.

The term "oligopeptide" herein denotes a protein,
polypeptide, or peptide having 25 or fewer monomeric
subunits.

The phrases "isolated protein", "isolated
polypeptide", "isolated peptide" and "isolated oligopeptide"
refer to a protein (or respectively to a polypeptide,
peptide, or oligopeptide) that is nonidentical to any
protein molecule of identical amino acid sequence as found
in nature; "isolated" does not require, although it does not
prohibit, that the protein so described has itself been
physically removed from its native environment.

For example, a protein can be said to be 
"isolated" when it includes amino acid analogues or 
derivatives not found in nature, or includes linkages other 
than standard peptide bonds. 



When instead composed entirely of natural amino 
acids linked by peptide bonds, a protein can be said to be 
"isolated" when it exists at a purity not found in nature — 
where purity can be adjudged with respect to the presence of 
proteins of other sequence, with respect to the presence of 
non-protein compounds, such as nucleic acids, lipids, or 
other components of a biological cell, or when it exists in 
a composition not found in nature, such as in a host 
cell that does not naturally express that protein. 

A "purified protein" (equally, a purified 
polypeptide, peptide, or oligopeptide) is an isolated 
protein, as above described, present at a concentration of 
at least 95%, as measured on a weight basis with respect to 
total protein in a composition. A "substantially purified 
protein" (equally, a substantially purified polypeptide, 
peptide, or oligopeptide) is an isolated protein, as above 
described, present at a concentration of at least 70%, as 
measured on a weight basis with respect to total protein in 
a composition. 
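
The two purity definitions above reduce to thresholds on the weight fraction of the isolated protein relative to total protein. A minimal sketch follows, assuming the masses are supplied in the same units; the function name and return labels are hypothetical and chosen only for illustration.

```python
def classify_purity(target_mass: float, total_protein_mass: float) -> str:
    """Classify an isolated protein by weight fraction of total protein.

    Returns the most stringent label satisfied; note that any "purified"
    protein (>= 95%) also meets the "substantially purified" (>= 70%) definition.
    """
    if total_protein_mass <= 0 or not 0 <= target_mass <= total_protein_mass:
        raise ValueError("masses must satisfy 0 <= target <= total and total > 0")
    fraction = target_mass / total_protein_mass
    if fraction >= 0.95:
        return "purified"
    if fraction >= 0.70:
        return "substantially purified"
    return "below both thresholds"


print(classify_purity(96.0, 100.0))  # 96% by weight
print(classify_purity(80.0, 100.0))  # 80% by weight
```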

As used herein, the phrase "protein isoforms" 
refers to a plurality of proteins having nonidentical 
primary amino acid sequence but that share amino acid 
sequence encoded by at least one common exon. 

As used herein, the phrase "alternative splicing"
and its linguistic equivalents includes all types of RNA
processing that lead to expression of plural protein
isoforms from a single gene; accordingly, the phrase "splice
variant(s)" and its linguistic equivalents embraces mRNAs
transcribed from a given gene that, however processed,
collectively encode plural protein isoforms. For example,
and by way of illustration only, splice variants can include
exon insertions, exon extensions, exon truncations, exon
deletions, alternatives in the 5' untranslated region ("5'
UT") and alternatives in the 3' untranslated region
("3' UT"). Such 3' alternatives include, for example,
differences in the site of RNA transcript cleavage and site
of poly(A) addition. See, e.g., Gautheret et al., Genome
Res. 8:524-530 (1998).

As used herein, "orthologues" are separate 
occurrences of the same gene in multiple species. The 
separate occurrences have similar, albeit nonidentical, 
amino acid sequences, the degree of sequence similarity 
depending, in part, upon the evolutionary distance of the 
species from a common ancestor having the same gene. 

As used herein, the term "paralogues" indicates 
separate occurrences of a gene in one species. The separate 
occurrences have similar, albeit nonidentical, amino acid 
sequences, the degree of sequence similarity depending, in 
part, upon the evolutionary distance from the gene 
duplication event giving rise to the separate occurrences. 

As used herein, the term "homologues" is generic 
to "orthologues" and "paralogues". 

As used herein, the term "antibody" refers to a 
polypeptide, at least a portion of which is encoded by at 
least one immunoglobulin gene, or fragment thereof, and that 
can bind specifically to a desired target molecule. The 
term includes naturally-occurring forms, as well as 
fragments and derivatives. 

Fragments within the scope of the term "antibody" 
include those produced by digestion with various proteases, 
those produced by chemical cleavage and/or chemical 
dissociation, and those produced recombinantly, so long as 
the fragment remains capable of specific binding to a target 
molecule. Among such fragments are Fab, Fab', Fv, F(ab')2,
and single chain Fv (scFv) fragments.

Derivatives within the scope of the term include
antibodies (or fragments thereof) that have been modified in
sequence, but remain capable of specific binding to a target
molecule, including: interspecies chimeric and humanized
antibodies; antibody fusions; heteromeric antibody complexes
and antibody fusions, such as diabodies (bispecific
antibodies), single-chain diabodies, and intrabodies (see,
e.g., Marasco (ed.), Intracellular Antibodies: Research and
Disease Applications, Springer-Verlag New York, Inc. (1998)
(ISBN: 3540641513), the disclosure of which is incorporated
herein by reference in its entirety).

As used herein, antibodies can be produced by any
known technique, including harvest from cell culture of
native B lymphocytes, harvest from culture of hybridomas,
recombinant expression systems, and phage display.

As used herein, "antigen" refers to a ligand that
can be bound by an antibody; an antigen need not itself be
immunogenic. The portions of the antigen that make contact
with the antibody are denominated "epitopes".

"Specific binding" refers to the ability of two
molecular species concurrently present in a heterogeneous
(inhomogeneous) sample to bind to one another in preference
to binding to other molecular species in the sample.
Typically, a specific binding interaction will discriminate
over adventitious binding interactions in the reaction by at
least two-fold, more typically by at least 10-fold, often at
least 100-fold; when used to detect analyte, specific
binding is sufficiently discriminatory when determinative of
the presence of the analyte in a heterogeneous
(inhomogeneous) sample. Typically, the affinity or avidity
of a specific binding reaction is at least about 10⁻⁷ M, with
specific binding reactions of greater specificity typically
having affinity or avidity of at least 10⁻⁸ M to at least
about 10⁻⁹ M.

As used herein, "molecular binding partners" - and
equivalently, "specific binding partners" - refer to pairs
of molecules, typically pairs of biomolecules, that exhibit
specific binding. Nonlimiting examples are receptor and
ligand, antibody and antigen, and biotin to any of avidin,
streptavidin, neutrAvidin and captAvidin.

The term "antisense", as used herein, refers to a
nucleic acid molecule sufficiently complementary in
sequence, and sufficiently long in that complementary
sequence, as to hybridize under intracellular conditions to
(i) a target mRNA transcript or (ii) the genomic DNA strand
complementary to that transcribed to produce the target mRNA
transcript.

The term "portion", as used with respect to
nucleic acids, proteins, and antibodies, is synonymous with
"fragment".

NUCLEIC ACID MOLECULES

In a first aspect, the invention provides isolated
nucleic acids that encode MDZ3, MDZ4, MDZ7 or MDZ12,
variants having at least 65% sequence identity thereto,
degenerate variants thereof, variants that encode MDZ3,
MDZ4, MDZ7 or MDZ12 proteins having conservative or
moderately conservative substitutions, cross-hybridizing
nucleic acids, and fragments thereof.

FIGS. 3, 6, 9, 12 and 13 present the nucleotide
sequences of the MDZ3, MDZ4, MDZ7, MDZ12a and MDZ12b cDNA
clones, respectively, with predicted amino acid
translations. The sequences are further presented in the
Sequence Listing, incorporated herein by reference in its
entirety, in SEQ ID NOs: 1 (full length nucleotide sequence
of human MDZ3 cDNA), 3 (full length amino acid sequence of
MDZ3), 3027 (full length nucleotide sequence of human MDZ4
cDNA), 3029 (full length amino acid sequence of MDZ4), 4407
(full length nucleotide sequence of human MDZ7 cDNA), 4409
(full length amino acid sequence of MDZ7), 5770 (full length
nucleotide sequence of human MDZ12a cDNA), 5772 (full length
amino acid sequence of MDZ12a), 6938 (full length nucleotide
sequence of human MDZ12b cDNA), 6939 (full length amino acid
sequence of MDZ12bS) and 6940 (full length amino acid
sequence of MDZ12bL).

Unless otherwise indicated, each nucleotide 
sequence is set forth herein as a sequence of 
deoxyribonucleotides. It is intended, however, that the
given sequence be interpreted as would be appropriate to the 
polynucleotide composition: for example, if the isolated 
nucleic acid is composed of RNA, the given sequence intends 
ribonucleotides, with uridine substituted for thymidine. 
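
The interpretive rule just stated can be sketched in a line of code. The example below is illustrative only: the function name is hypothetical, and the input is assumed to be a plain string of the letters A, C, G and T.

```python
def as_rna(dna_sequence: str) -> str:
    """Read a sequence given as deoxyribonucleotides as the corresponding RNA.

    Uridine (U) is substituted for thymidine (T); other bases are unchanged.
    """
    return dna_sequence.upper().replace("T", "U")


print(as_rna("ATGGCTTAA"))
```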

Unless otherwise indicated, nucleotide sequences 
of the isolated nucleic acids of the present invention were 
determined by sequencing a DNA molecule that had resulted, 
directly or indirectly, from at least one enzymatic 
polymerization reaction (e.g., reverse transcription and/or 
polymerase chain reaction) using an automated sequencer 
(such as the MegaBACE™ 1000, Molecular Dynamics, Sunnyvale, 
CA, USA), or by reliance upon such sequence or upon genomic
sequence prior-accessioned into a public database. Unless 
otherwise indicated, all amino acid sequences of the 
polypeptides of the present invention were predicted by 
translation from the nucleic acid sequences so determined. 

As a consequence, any nucleic acid sequence
presented herein may contain errors introduced by erroneous
incorporation of nucleotides during polymerization, by
erroneous base calling by the automated sequencer (although
such sequencing errors have been minimized for the nucleic
acids directly determined herein, unless otherwise
indicated, by the sequencing of each of the complementary
strands of a duplex DNA), or by similar errors accessioned
into the public database.

Accordingly, the MDZ3, MDZ4, MDZ7, MDZ12a and
MDZ12b cDNA clones described herein have been separately
deposited in a public repository (American Type Culture
Collection, Manassas, Virginia, USA, "ATCC"). Each of the
MDZ3, MDZ4 and MDZ7 cDNA clones was sent for deposit to ATCC in
a discrete tube on August 1, 2001 and received by ATCC
August 2, 2001, and respectively accorded accession numbers
of ______. The two
splice variants of the MDZ12 gene - MDZ12a and MDZ12b - were
sent for deposit in admixture in a single tube to ATCC on
August 1, 2001, received at ATCC on August 2, 2001, and
jointly accorded the single accession number ______.
Any errors in sequence reported herein can be determined and
corrected by sequencing nucleic acids propagated from the
deposited clones using standard techniques.

Single nucleotide polymorphisms (SNPs) occur
frequently in eukaryotic genomes - more than 1.4 million
SNPs have already been identified in the human genome,
International Human Genome Sequencing Consortium, Nature
409:860-921 (2001) - and the sequence determined from one
individual of a species may differ from other allelic forms
present within the population. Additionally, small
deletions and insertions, rather than single nucleotide
polymorphisms, are not uncommon in the general population,
and often do not alter the function of the protein.

Accordingly, it is an aspect of the present
invention to provide nucleic acids not only identical in
sequence to those described with particularity herein, but
also to provide isolated nucleic acids at least about 65%
identical in sequence to those described with particularity
herein, typically at least about 70%, 75%, 80%, 85%, or 90%
identical in sequence to those described with particularity
herein, usefully at least about 91%, 92%, 93%, 94%, or 95%
identical in sequence to those described with particularity
herein, usefully at least about 96%, 97%, 98%, or 99%
identical in sequence to those described with particularity
herein, and, most conservatively, at least about 99.5%,
99.6%, 99.7%, 99.8% and 99.9% identical in sequence to those
described with particularity herein. These sequence
variants can be naturally occurring or can result from human
intervention, as by random or directed mutagenesis.

For purposes herein, percent identity of two
nucleic acid sequences is determined using the procedure of
Tatusova et al., "Blast 2 sequences - a new tool for
comparing protein and nucleotide sequences", FEMS Microbiol
Lett. 174:247-250 (1999), which procedure is effectuated by
the computer program BLAST 2 SEQUENCES, available online at

http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html.

To assess percent identity of nucleic acids, the BLASTN
module of BLAST 2 SEQUENCES is used with default values of
(i) reward for a match: 1; (ii) penalty for a mismatch: -2;
(iii) open gap 5 and extension gap 2 penalties; (iv) gap
X_dropoff 50, expect 10, word size 11, and filter; and both
sequences are entered in their entireties.
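
For illustration only, the following sketch shows how a percent identity figure and a raw score can be read off a pairwise alignment that has already been computed. It is not BLAST 2 SEQUENCES and does not perform the alignment itself. The scoring constants are the default values recited above; the function name is hypothetical. Two points are assumptions: that a gap of k columns costs the opening penalty once plus k times the extension penalty, and that identity is taken over all alignment columns, gaps included.

```python
MATCH_REWARD = 1       # (i) reward for a match
MISMATCH_PENALTY = -2  # (ii) penalty for a mismatch
GAP_OPEN = 5           # (iii) open gap penalty
GAP_EXTEND = 2         # (iii) extension gap penalty


def score_alignment(aligned_a: str, aligned_b: str) -> tuple:
    """Return (percent identity, raw score) for two gapped, equal-length rows.

    Gaps are written '-'. Percent identity is taken over all alignment columns.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned rows must be non-empty and of equal length")
    matches, score, in_gap = 0, 0, False
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            if not in_gap:          # assumed: opening cost charged once per gap
                score -= GAP_OPEN
                in_gap = True
            score -= GAP_EXTEND     # assumed: extension cost per gapped column
        else:
            in_gap = False
            if x == y:
                matches += 1
                score += MATCH_REWARD
            else:
                score += MISMATCH_PENALTY
    return 100.0 * matches / len(aligned_a), score


print(score_alignment("ACGT-CGT", "ACGTACGA"))
```
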

As is well known, the genetic code is degenerate,
with each amino acid except methionine and tryptophan
translated from a plurality of codons, thus permitting a
plurality of nucleic acids of disparate sequence to encode
the identical protein.
As is also well known, codon choice for optimal expression
varies from species to species. The isolated nucleic acids
of the present invention being useful for expression of
MDZ3, MDZ4, MDZ7 or MDZ12 proteins and protein fragments, it
is, therefore, another aspect of the present invention to
provide isolated nucleic acids that encode MDZ3, MDZ4, MDZ7
or MDZ12 proteins and portions thereof not only identical in
sequence to those described with particularity herein, but
degenerate variants thereof as well.
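
The extent of this degeneracy can be made concrete by counting, for a given peptide, how many distinct coding sequences specify it. The sketch below assumes the standard genetic code (in which methionine and tryptophan are the only residues with a single codon); the function name and the example peptides are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_degenerate_variants(peptide: str) -> int:
    """Number of distinct coding sequences that encode the given peptide."""
    total = 1
    for residue in peptide.upper():
        if residue == "*" or residue not in CODONS_PER_RESIDUE:
            raise ValueError(f"not a standard amino acid: {residue!r}")
        total *= CODONS_PER_RESIDUE[residue]
    return total


single_codon = sorted(aa for aa, n in CODONS_PER_RESIDUE.items() if n == 1)
print(single_codon)
print(count_degenerate_variants("MLSR"))  # 1 x 6 x 6 x 6 coding sequences
```

Because the count is a product over residues, even a short peptide is encoded by a large number of degenerate variants.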

As is also well known, amino acid substitutions 
occur frequently among natural allelic variants, with 
conservative substitutions often occasioning only de minimis 
change in protein function. 

Accordingly, it is an aspect of the present 
invention to provide nucleic acids not only identical in 
sequence to those described with particularity herein, but 
also to provide isolated nucleic acids that encode MDZ3, 
MDZ4, MDZ7 or MDZ12, and portions thereof, having 
conservative amino acid substitutions, and also to provide 
isolated nucleic acids that encode MDZ3, MDZ4, MDZ7 or 
MDZ12, and portions thereof, having moderately conservative 
amino acid substitutions. 

Although there are a variety of metrics for 
calling conservative amino acid substitutions, based 
primarily on either observed changes among evolutionarily 
related proteins or on predicted chemical similarity, for 
purposes herein a conservative replacement is any change 
having a positive value in the PAM250 log-likelihood matrix 
reproduced herein below (see Gonnet et al., Science
256(5062):1443-5 (1992)):

For purposes herein, a "moderately conservative" replacement
is any change having a nonnegative value in the PAM250
log-likelihood matrix reproduced herein above.
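
The two definitions differ only in their treatment of a matrix value of exactly zero. A minimal sketch follows; it takes the matrix value for a given residue pair as an input rather than embedding the matrix, and the function names and numeric inputs are hypothetical.

```python
def is_conservative(pam250_value: float) -> bool:
    """Conservative replacement: strictly positive matrix value."""
    return pam250_value > 0


def is_moderately_conservative(pam250_value: float) -> bool:
    """Moderately conservative replacement: nonnegative matrix value."""
    return pam250_value >= 0


# Hypothetical matrix values, used only to show the boundary behaviour.
for value in (1.5, 0.0, -0.8):
    print(value, is_conservative(value), is_moderately_conservative(value))
```

Every conservative replacement is therefore also moderately conservative, but not the reverse.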

As is also well known in the art, relatedness of 
nucleic acids can also be characterized using a functional 
test, the ability of the two nucleic acids to base-pair to 
one another at defined hybridization stringencies. 

It is, therefore, another aspect of the invention
to provide isolated nucleic acids not only identical in
sequence to those described with particularity herein, but
also to provide isolated nucleic acids ("cross-hybridizing
nucleic acids") that hybridize under high stringency
conditions (as defined herein below) to all or to a portion
of various of the isolated MDZ3, MDZ4, MDZ7 or MDZ12 nucleic
acids of the present invention ("reference nucleic acids"),
as well as cross-hybridizing nucleic acids that hybridize
under moderate stringency conditions to all or to a portion
of various of the isolated MDZ3, MDZ4, MDZ7 or MDZ12 nucleic
acids of the present invention.

Such cross-hybridizing nucleic acids are useful, 
inter alia, as probes for, and to drive expression of, 
proteins related to the proteins of the present invention as 
alternative isoforms, homologues, paralogues, and 
orthologues. Particularly useful orthologues are those from 
other primate species, such as chimpanzee, rhesus macaque, 
monkey, baboon, orangutan, and gorilla; from rodents, such 
as rats, mice, guinea pigs; from lagomorphs, such as 
rabbits; and from domestic livestock, such as cow, pig, 
sheep, horse, goat and chickens. 

For purposes herein, high stringency conditions 
are defined as aqueous hybridization (i.e., free of 
formamide) in 6X SSC (where 20X SSC contains 3.0 M NaCl and 
0.3 M sodium citrate), 1% SDS at 65°C for at least 8 hours, 
followed by one or more washes in 0.2X SSC, 0.1% SDS at 
65°C. For purposes herein, moderate stringency conditions 
are defined as aqueous hybridization (i.e., free of 
formamide) in 6X SSC, 1% SDS at 65°C for at least 8 hours, 
followed by one or more washes in 2X SSC, 0.1% SDS at room
temperature.

The hybridizing portion of the reference nucleic
acid is typically at least 15 nucleotides in length, often
at least 17 nucleotides in length. Often, however, the
hybridizing portion of the reference nucleic acid is at
least 20 nucleotides in length, 25 nucleotides in length,
and even 30 nucleotides, 35 nucleotides, 40 nucleotides, and
50 nucleotides in length. Of course, cross-hybridizing
nucleic acids that hybridize to a larger portion of the
reference nucleic acid - for example, to a portion of at
least 50 nt, at least 100 nt, at least 150 nt, 200 nt, 250
nt, 300 nt, 350 nt, 400 nt, 450 nt, or 500 nt or more - or
even to the entire length of the reference nucleic acid, are
also useful.

The hybridizing portion of the cross-hybridizing
nucleic acid is at least 75% identical in sequence to at
least a portion of the reference nucleic acid. Typically,
the hybridizing portion of the cross-hybridizing nucleic
acid is at least 80%, often at least 85%, 86%, 87%, 88%, 89%
or even at least 90% identical in sequence to at least a
portion of the reference nucleic acid. Often, the
hybridizing portion of the cross-hybridizing nucleic acid
will be at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or
99% identical in sequence to at least a portion of the
reference nucleic acid sequence. At times, the hybridizing
portion of the cross-hybridizing nucleic acid will be at
least 99.5% identical in sequence to at least a portion of
the reference nucleic acid.

The invention also provides fragments of various
of the isolated nucleic acids of the present invention.

By "fragments" of a reference nucleic acid is here
intended isolated nucleic acids, however obtained, that have
a nucleotide sequence identical to a portion of the
reference nucleic acid sequence, which portion is at least
17 nucleotides and less than the entirety of the reference
nucleic acid. As so defined, "fragments" need not be
obtained by physical fragmentation of the reference nucleic
acid, although such provenance is not thereby precluded.

In theory, an oligonucleotide of 17 nucleotides is
of sufficient length as to occur at random less frequently
than once in the three gigabase human genome, and thus to
provide a nucleic acid probe that can uniquely identify the
reference sequence in a nucleic acid mixture of genomic
complexity. As is well known, further specificity can be
obtained by probing nucleic acid samples of subgenomic
complexity, and/or by using plural fragments as short as 17
nucleotides in length collectively to prime amplification of
nucleic acids, as, e.g., by polymerase chain reaction (PCR).
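
The arithmetic behind the 17-nucleotide figure can be sketched as follows. The calculation is an idealization: it assumes the four bases are independent and equally frequent, which real genomes are not, and the function names are hypothetical.

```python
GENOME_SIZE_BP = 3_000_000_000  # "three gigabase" human genome, per the text


def expected_random_occurrences(k: int, positions: int = GENOME_SIZE_BP) -> float:
    """Expected chance matches to one specific k-mer among `positions` start sites."""
    return positions / 4 ** k


def shortest_unique_length(positions: int = GENOME_SIZE_BP) -> int:
    """Smallest k whose expected number of chance matches is below one."""
    k = 1
    while 4 ** k <= positions:
        k += 1
    return k


print(4 ** 17)                                   # distinct 17-mers
print(round(expected_random_occurrences(17), 3)) # expected chance matches, one strand
print(shortest_unique_length())                  # one strand considered
print(shortest_unique_length(2 * GENOME_SIZE_BP))  # both strands considered
```

Under these assumptions there are about 1.7 x 10^10 possible 17-mers, so a given 17-mer is expected roughly 0.17 times by chance in three gigabases; when both strands are counted as possible match sites, 17 is the shortest length for which the expectation falls below one.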

As further described herein below, nucleic acid 
fragments that encode at least 6 contiguous amino acids 
(i.e., fragments of 18 nucleotides or more) are useful in 
directing the expression or the synthesis of peptides that 
have utility in mapping the epitopes of the protein encoded 
by the reference nucleic acid. See, e.g., Geysen et al.,
"Use of peptide synthesis to probe viral antigens for
epitopes to a resolution of a single amino acid," Proc.
Natl. Acad. Sci. USA 81:3998-4002 (1984); and U.S. Pat. Nos.
4,708,871 and 5,595,915, the disclosures of which are 
incorporated herein by reference in their entireties. 

As further described herein below, fragments that 
encode at least 8 contiguous amino acids (i.e., fragments of 
24 nucleotides or more) are useful in directing the 
expression or the synthesis of peptides that have utility as 
immunogens. See, e.g., Lerner, "Tapping the immunological 
repertoire to produce antibodies of predetermined 
specificity," Nature 299:592-596 (1982); Shinnick et al.,
"Synthetic peptide immunogens as vaccines," Annu. Rev.
Microbiol. 37:425-46 (1983); Sutcliffe et al., "Antibodies
that react with predetermined sites on proteins," Science
219:660-6 (1983), the disclosures of which are incorporated
herein by reference in their entireties. 

The nucleic acid fragment of the present invention
is thus at least 17 nucleotides in length, typically at
least 18 nucleotides in length, and often at least 24
nucleotides in length. Often, the nucleic acid of the
present invention is at least 25 nucleotides in length, and
even 30 nucleotides, 35 nucleotides, 40 nucleotides, or 45
nucleotides in length. Of course, larger fragments having
at least 50 nt, at least 100 nt, at least 150 nt, 200 nt,
250 nt, 300 nt, 350 nt, 400 nt, 450 nt, or 500 nt or more
are also useful, and at times preferred.

Having been based upon the mining of genomic 
sequence, rather than upon surveillance of expressed 
message, the present invention further provides isolated 
genome-derived nucleic acids that include portions of the 
MDZ3, MDZ4, MDZ7 or MDZ12 gene. 

The invention particularly provides genome-derived 
single exon probes. 

As further described in commonly owned and 
copending U.S. patent application serial nos. 09/864,761, 
filed May 23, 2001; 09/774,203, filed January 29, 2001; and 
09/632,366, filed August 3, 2000, the disclosures of which 
are incorporated herein by reference in their entireties, "a 
single exon probe" comprises at least part of an exon 
("reference exon") and can hybridize detectably under high 
stringency conditions to transcript-derived nucleic acids 
that include the reference exon. The single exon probe will 
not, however, hybridize detectably under high stringency 
conditions to nucleic acids that lack the reference exon and 
instead consist of one or more exons that are found adjacent 
to the reference exon in the genome. 

Genome-derived single exon probes typically
further comprise, contiguous to a first end of the exon
portion, a first intronic and/or intergenic sequence that is
identically contiguous to the exon in the genome. Often,
the genome-derived single exon probe further comprises,
contiguous to a second end of the exonic portion, a second
intronic and/or intergenic sequence that is identically
contiguous to the exon in the genome.

The minimum length of genome-derived single exon
probes is defined by the requirement that the exonic portion
be of sufficient length to hybridize under high stringency
conditions to transcript-derived nucleic acids.
Accordingly, the exon portion is at least 17 nucleotides,
typically at least 18 nucleotides, 20 nucleotides, 24
nucleotides, 25 nucleotides or even 30, 35, 40, 45, or 50
nucleotides in length, and can usefully include the entirety
of the exon, up to 100 nt, 150 nt, 200 nt, 250 nt, 300 nt,
350 nt, 400 nt or even 500 nt or more in length.

The maximum length of genome-derived single exon
probes is defined by the requirement that the probes contain
portions of no more than one exon, that is, be unable to
hybridize detectably under high stringency conditions to
nucleic acids that lack the reference exon but include one
or more exons that are found adjacent to the reference exon
in the genome.

Given variable spacing of exons through eukaryotic
genomes, the maximum length of single exon probes of the
present invention is typically no more than 25 kb, often no
more than 20 kb, 15 kb, 10 kb or 7.5 kb, or even no more
than 5 kb, 4 kb, 3 kb, or even no more than about 2.5 kb in
length.

The genome-derived single exon probes of the
present invention can usefully include at least a first
terminal priming sequence not found in contiguity with the
rest of the probe sequence in the genome, and often will
contain a second terminal priming sequence not found in
contiguity with the rest of the probe sequence in the
genome.



The present invention also provides isolated 
genome-derived nucleic acids that include nucleic acid 
sequence elements that control transcription of the MDZ3, 
MDZ4, MDZ7 or MDZ12 gene. 

With a complete draft of the human genome now 
available, genomic sequences that are within the vicinity of 
the MDZ3, MDZ4, MDZ7 or MDZ12 coding region (and that are 
additional to those described with particularity herein) can 
readily be obtained by PCR amplification. 

The isolated nucleic acids of the present 
invention can be composed of natural nucleotides in native 
5'-3' phosphodiester internucleoside linkage - e.g., DNA or
RNA - or can contain any or all of nonnatural nucleotide 
analogues, nonnative internucleoside bonds, or post- 
synthesis modifications, either throughout the length of the 
nucleic acid or localized to one or more portions thereof. 

As is well known in the art, when the isolated 
nucleic acid is used as a hybridization probe, the range of 
such nonnatural analogues, nonnative internucleoside bonds, 
or post-synthesis modifications will be limited to those 
that permit sequence-discriminating basepairing of the 
resulting nucleic acid. When used to direct expression of
RNA or protein in vitro or in vivo, the range of such
nonnatural analogues, nonnative internucleoside bonds, or 
post-synthesis modifications will be limited to those that 
permit the nucleic acid to function properly as a 
polymerization substrate. When the isolated nucleic acid is 
used as a therapeutic agent, the range of such changes will 
be limited to those that do not confer toxicity upon the 
isolated nucleic acid. 

For example, when desired to be used as probes,
the isolated nucleic acids of the present invention can
usefully include nucleotide analogues that incorporate
labels that are directly detectable, such as radiolabels or
fluorophores, or nucleotide analogues that incorporate
labels that can be visualized in a subsequent reaction, such
as biotin or various haptens.

Common radiolabeled analogues include those
labeled with ³³P, ³²P, and ³⁵S, such as α-³²P-dATP, α-³²P-dCTP,
α-³²P-dGTP, α-³²P-dTTP, α-³²P-3'dATP, α-³²P-ATP, α-³²P-CTP,
α-³²P-GTP, α-³²P-UTP, α-³⁵S-dATP, γ-³⁵S-GTP, γ-³³P-dATP, and the
like.

Commercially available fluorescent nucleotide
analogues readily incorporated into the nucleic acids of the
present invention include Cy3-dCTP, Cy3-dUTP, Cy5-dCTP,
Cy3-dUTP (Amersham Pharmacia Biotech, Piscataway, New Jersey,
USA), fluorescein-12-dUTP, tetramethylrhodamine-6-dUTP,
Texas Red®-5-dUTP, Cascade Blue®-7-dUTP, BODIPY® FL-14-dUTP,
BODIPY® TMR-14-dUTP, BODIPY® TR-14-dUTP, Rhodamine
Green™-5-dUTP, Oregon Green® 488-5-dUTP, Texas
Red®-12-dUTP, BODIPY® 630/650-14-dUTP, BODIPY®
650/665-14-dUTP, Alexa Fluor® 488-5-dUTP, Alexa Fluor®
532-5-dUTP, Alexa Fluor® 568-5-dUTP, Alexa Fluor®
594-5-dUTP, Alexa Fluor® 546-14-dUTP, fluorescein-12-UTP,
tetramethylrhodamine-6-UTP, Texas Red®-5-UTP, Cascade
Blue®-7-UTP, BODIPY® FL-14-UTP, BODIPY® TMR-14-UTP, BODIPY®
TR-14-UTP, Rhodamine Green™-5-UTP, Alexa Fluor® 488-5-UTP,
Alexa Fluor® 546-14-UTP (Molecular Probes, Inc., Eugene, OR,
USA).

Protocols are available for custom synthesis of
nucleotides having other fluorophores. Henegariu et al.,
"Custom Fluorescent-Nucleotide Synthesis as an Alternative
Method for Nucleic Acid Labeling," Nature Biotechnol.
18:345-348 (2000), the disclosure of which is incorporated herein
by reference in its entirety.



Haptens that are commonly conjugated to
nucleotides for subsequent labeling include biotin
(biotin-11-dUTP, Molecular Probes, Inc., Eugene, OR, USA;
biotin-21-UTP, biotin-21-dUTP, Clontech Laboratories, Inc.,
Palo Alto, CA, USA), digoxigenin (DIG-11-dUTP, alkali
labile, DIG-11-UTP, Roche Diagnostics Corp., Indianapolis,
IN, USA), and dinitrophenyl (dinitrophenyl-11-dUTP,
Molecular Probes, Inc., Eugene, OR, USA).

As another example, when desired to be used for
antisense inhibition of transcription or translation, the
isolated nucleic acids of the present invention can usefully
include altered, often nuclease-resistant, internucleoside
bonds. See Hartmann et al. (eds.), Manual of Antisense
Methodology (Perspectives in Antisense Science), Kluwer Law
International (1999) (ISBN: 079238539X); Stein et al.
(eds.), Applied Antisense Oligonucleotide Technology,
Wiley-Liss (1998) (ISBN: 0471172790); Chadwick et al.
(eds.), Oligonucleotides as Therapeutic Agents - Symposium
No. 209, John Wiley & Son Ltd (1997) (ISBN: 0471972797), the
disclosures of which are incorporated herein by reference in
their entireties. Such altered internucleoside bonds are
often desired also when the isolated nucleic acid of the
present invention is to be used for targeted gene
correction, Gamper et al., Nucl. Acids Res. 28(21):4332-4339
(2000), the disclosure of which is incorporated herein by
reference in its entirety.

Modified oligonucleotide backbones often preferred
when the nucleic acid is to be used for antisense purposes
are, for example, phosphorothioates, chiral
phosphorothioates, phosphorodithioates, phosphotriesters,
aminoalkylphosphotriesters, methyl and other alkyl
phosphonates including 3'-alkylene phosphonates and chiral
phosphonates, phosphinates, phosphoramidates including
3'-amino phosphoramidate and aminoalkylphosphoramidates,
thionophosphoramidates, thionoalkylphosphonates,
thionoalkylphosphotriesters, and boranophosphates having
normal 3'-5' linkages, 2'-5' linked analogs of these, and
those having inverted polarity wherein the adjacent pairs of
nucleoside units are linked 3'-5' to 5'-3' or 2'-5' to
5'-2'. Representative U.S. patents that teach the
preparation of the above phosphorus-containing linkages
include, but are not limited to, U.S. Pat. Nos. 3,687,808;
4,469,863; 4,476,301; 5,023,243; 5,177,196; 5,188,897;
5,264,423; 5,276,019; 5,278,302; 5,286,717; 5,321,131;
5,399,676; 5,405,939; 5,453,496; 5,455,233; 5,466,677;
5,476,925; 5,519,126; 5,536,821; 5,541,306; 5,550,111;
5,563,253; 5,571,799; 5,587,361; and 5,625,050, the
disclosures of which are incorporated herein by reference in
their entireties.

Preferred modified oligonucleotide backbones for
antisense use that do not include a phosphorus atom have
backbones that are formed by short chain alkyl or cycloalkyl
internucleoside linkages, mixed heteroatom and alkyl or
cycloalkyl internucleoside linkages, or one or more short
chain heteroatomic or heterocyclic internucleoside linkages.
These include those having morpholino linkages (formed in
part from the sugar portion of a nucleoside); siloxane
backbones; sulfide, sulfoxide and sulfone backbones;
formacetyl and thioformacetyl backbones; methylene
formacetyl and thioformacetyl backbones; alkene containing
backbones; sulfamate backbones; methyleneimino and
methylenehydrazino backbones; sulfonate and sulfonamide
backbones; amide backbones; and others having mixed N, O, S
and CH2 component parts. Representative U.S. patents that
teach the preparation of the above backbones include, but
are not limited to, U.S. Pat. Nos. 5,034,506; 5,166,315;
5,185,444; 5,214,134; 5,216,141; 5,235,033; 5,264,562;
5,264,564; 5,405,938; 5,434,257; 5,466,677; 5,470,967;
5,489,677; 5,541,307; 5,561,225; 5,596,086; 5,602,240;
5,610,289; 5,602,240; 5,608,046; 5,610,289; 5,618,704;
5,623,070; 5,663,312; 5,633,360; 5,677,437; and 5,677,439,
the disclosures of which are incorporated herein by
reference in their entireties.

In other preferred oligonucleotide mimetics, both 
the sugar and the internucleoside linkage are replaced with 
novel groups, such as peptide nucleic acids (PNA).

In PNA compounds, the phosphodiester backbone of 
the nucleic acid is replaced with an amide-containing 
backbone, in particular by repeating N-(2-aminoethyl) 
glycine units linked by amide bonds. Nucleobases are bound 
directly or indirectly to aza nitrogen atoms of the amide 
portion of the backbone, typically by methylene carbonyl 
linkages.

The uncharged nature of the PNA backbone provides 
PNA/DNA and PNA/RNA duplexes with a higher thermal stability 
than is found in DNA/DNA and DNA/RNA duplexes, resulting 
from the lack of charge repulsion between the PNA and DNA or 
RNA strand. In general, the Tm of a PNA/DNA or PNA/RNA 
duplex is 1°C higher per base pair than the Tm of the 
corresponding DNA/DNA or DNA/RNA duplex (in 100 mM NaCl).
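
The general rule stated above can be written as a one-line calculation. The following Python sketch is illustrative only: the function name and its parameters are assumptions introduced for this example, and the estimate is meaningful only under the stated condition (100 mM NaCl).

```python
def estimate_pna_duplex_tm(dna_duplex_tm_c, base_pairs, increment_per_bp_c=1.0):
    """Estimate the Tm (deg C) of a PNA/DNA or PNA/RNA duplex.

    Applies the general rule from the text: about 1 deg C higher per base
    pair than the corresponding DNA/DNA or DNA/RNA duplex (100 mM NaCl).
    The function name and parameters are hypothetical.
    """
    if base_pairs < 1:
        raise ValueError("base_pairs must be a positive integer")
    return dna_duplex_tm_c + increment_per_bp_c * base_pairs


# Example: a 15-mer whose DNA/DNA duplex melts at 50 deg C.
print(estimate_pna_duplex_tm(50.0, 15))
```

The increment is exposed as a parameter because the text gives it only as a general figure, not as a constant for every sequence.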

The neutral backbone also allows PNA to form 
stable duplexes with DNA largely independent of salt 
concentration. At low ionic strength, PNA can be hybridized 
to a target sequence at temperatures that make DNA 
hybridization problematic or impossible. And unlike DNA/DNA 
duplex formation, PNA hybridization is possible in the 
absence of magnesium. Adjusting the ionic strength, 
therefore, is useful if competing DNA or RNA is present in 
the sample, or if the nucleic acid being probed contains a 
high level of secondary structure.

PNA also demonstrates greater specificity in 
binding to complementary DNA. A PNA/DNA mismatch is more 
destabilizing than a DNA/DNA mismatch. A single mismatch in 
a mixed PNA/DNA 15-mer lowers the Tm by 8-20°C (15°C on 
average). In the corresponding DNA/DNA duplexes, a single 
mismatch lowers the Tm by 4-16°C (11°C on average). Because 
PNA probes can be significantly shorter than DNA probes, 
their specificity is greater.
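
The single-mismatch figures quoted above for 15-mers can be tabulated and applied to a matched-duplex Tm. This Python sketch is a hedged illustration: the dictionary and function names are assumptions made for the example, and the numbers are simply the ranges and averages stated in the text.

```python
# Single-mismatch Tm reductions (deg C) for 15-mers, as stated in the text.
SINGLE_MISMATCH_TM_DROP_C = {
    "PNA/DNA": {"min": 8.0, "max": 20.0, "mean": 15.0},
    "DNA/DNA": {"min": 4.0, "max": 16.0, "mean": 11.0},
}


def mismatched_tm_range(matched_tm_c, duplex_type):
    """Return (lowest, average, highest) expected Tm after one mismatch.

    `duplex_type` must be "PNA/DNA" or "DNA/DNA"; both the names and the
    table layout are hypothetical conveniences for this sketch.
    """
    drop = SINGLE_MISMATCH_TM_DROP_C[duplex_type]
    return (
        matched_tm_c - drop["max"],   # largest stated reduction
        matched_tm_c - drop["mean"],  # average stated reduction
        matched_tm_c - drop["min"],   # smallest stated reduction
    )


print(mismatched_tm_range(65.0, "PNA/DNA"))
print(mismatched_tm_range(50.0, "DNA/DNA"))
```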

Additionally, nucleases and proteases do not 
recognize the PNA polyamide backbone with nucleobase 
sidechains. As a result, PNA oligomers are resistant to 
degradation by enzymes, and the lifetime of these compounds 
is extended both in vivo and in vitro. In addition, PNA is 
stable over a wide pH range.

Because its backbone is formed from amide bonds, 
PNA can be synthesized using a modified peptide synthesis 
protocol. PNA oligomers can be synthesized by both Fmoc and 
tBoc methods. Representative U.S. patents that teach the 
preparation of PNA compounds include, but are not limited 
to, U.S. Pat. Nos. 5,539,082; 5,714,331; and 5,719,262, each 
of which is hereby incorporated herein by reference; 
automated PNA synthesis is readily achievable on commercial 
synthesizers (see, e.g., "PNA User's Guide," Rev. 2, 
February 1998, PerSeptive Biosystems Part No. 60138, Applied 
Biosystems, Inc., Foster City, CA).

PNA chemistry and applications are reviewed, inter 
alia, in Ray et al., FASEB J. 14(9):1041-60 (2000); Nielsen 
et al., Pharmacol Toxicol. 86(1):3-7 (2000); Larsen et al., 
Biochim Biophys Acta. 1489(1):159-66 (1999); Nielsen, Curr. 
Opin. Struct. Biol. 9(3):353-7 (1999), and Nielsen, Curr. 
Opin. Biotechnol. 10(1):71-5 (1999), the disclosures of 
which are incorporated herein by reference in their 
entireties.

Differences from nucleic acid compositions found 
in nature - e.g., nonnative bases, altered internucleoside 
linkages, post-synthesis modification - can be present 
throughout the length of the nucleic acid or can, instead, 
usefully be localized to discrete portions thereof. As an 
example of the latter, chimeric nucleic acids can be 
synthesized that have discrete DNA and RNA domains and 
demonstrated utility for targeted gene repair, as further 
described in U.S. Pat. Nos. 5,760,012 and 5,731,181, the 
disclosures of which are incorporated herein by reference in 
their entireties. As another example, chimeric nucleic 
acids comprising both DNA and PNA have been demonstrated to 
have utility in modified PCR reactions. See Misra et al., 
Biochem. 37:1917-1925 (1998); see also Finn et al., Nucl. 
Acids Res. 24:3357-3363 (1996), incorporated herein by 
reference.

Unless otherwise specified, nucleic acids of the 
present invention can include any topological conformation 
appropriate to the desired use; the term thus explicitly 
comprehends, among others, single-stranded, double-stranded, 
triplexed, quadruplexed, partially double-stranded, 
partially-triplexed, partially-quadruplexed, branched, 
hairpinned, circular, and padlocked conformations. Padlock 
conformations and their utilities are further described in 
Baner et al., Curr. Opin. Biotechnol. 12:11-15 (2001); 
Escude et al., Proc. Natl. Acad. Sci. USA 96(19):10603-7 
(1999); Nilsson et al., Science 265(5181):2085-8 (1994), the 
disclosures of which are incorporated herein by reference in 
their entireties. Triplex and quadruplex conformations, and 
their utilities, are reviewed in Praseuth et al., Biochim. 
Biophys. Acta. 1489(1):181-206 (1999); Fox, Curr. Med. Chem. 
7(1):17-37 (2000); Kochetkova et al., Methods Mol. Biol. 
130:189-201 (2000); Chan et al., J. Mol. Med. 75(4):267-82 
(1997), the disclosures of which are incorporated herein by 
reference in their entireties.

The nucleic acids of the present invention can be 
detectably labeled. 

Commonly-used labels include radionuclides, such 
as 32P, 33P, 35S, 3H (and for NMR detection, 13C and 15N), 
haptens that can be detected by specific antibody or high 
affinity binding partner (such as avidin), and fluorophores.

As noted above, detectable labels can be 
incorporated by inclusion of labeled nucleotide analogues in 
the nucleic acid. Such analogues can be incorporated by 
enzymatic polymerization, such as by nick translation, 
random priming, polymerase chain reaction (PCR), terminal 
transferase tailing, and end-filling of overhangs, for DNA 
molecules, and in vitro transcription driven, e.g., from 
phage promoters, such as T7, T3, and SP6, for RNA molecules. 
Commercial kits are readily available for each such 
labeling approach.

Analogues can also be incorporated during 
automated solid phase chemical synthesis. 

As is well known, labels can also be incorporated 
after nucleic acid synthesis, with the 5' phosphate and 3' 
hydroxyl providing convenient sites for post-synthetic 
covalent attachment of detectable labels.

Various other post-synthetic approaches permit 
internal labeling of nucleic acids.

For example, fluorophores can be attached using a 
cisplatin reagent that reacts with the N7 of guanine 
residues (and, to a lesser extent, adenine bases) in DNA, 
RNA, and PNA to provide a stable coordination complex 
between the nucleic acid and fluorophore label (Universal 
Linkage System) (available from Molecular Probes, Inc., 
Eugene, OR, USA and Amersham Pharmacia Biotech, Piscataway, 
NJ, USA); see Alers et al., Genes, Chromosomes & Cancer, 
Vol. 25, pp. 301-305 (1999); Jelsma et al., J. NIH Res. 
5:82 (1994); Van Belkum et al., BioTechniques 16:148-153 
(1994), incorporated herein by reference. As another 
example, nucleic acids can be labeled using a 
disulfide-containing linker (FastTag™ Reagent, Vector 
Laboratories, Inc., Burlingame, CA, USA) that is photo- or 
thermally coupled to the target nucleic acid using aryl 
azide chemistry; after reduction, a free thiol is available 
for coupling to a hapten, fluorophore, sugar, affinity 
ligand, or other marker.

Multiple independent or interacting labels can be 
incorporated into the nucleic acids of the present 
invention.

For example, both a fluorophore and a moiety that 
in proximity thereto acts to quench fluorescence can be 
included to report specific hybridization through release of 
fluorescence quenching, Tyagi et al., Nature Biotechnol. 14: 
303-308 (1996); Tyagi et al., Nature Biotechnol. 16:49-53 
(1998); Sokol et al., Proc. Natl. Acad. Sci. USA 95: 
11538-11543 (1998); Kostrikis et al., Science 279:1228-1229 
(1998); Marras et al., Genet. Anal. 14:151-156 (1999); 
U.S. Pat. Nos. 5,846,726 and 5,925,517, or to report 
exonucleotidic excision, U.S. Pat. No. 5,538,848; Holland et 
al., Proc. Natl. Acad. Sci. USA 88:7276-7280 (1991); Heid et 
al., Genome Res. 6(10):986-94 (1996); Kuimelis et al., 
Nucleic Acids Symp Ser. (37):255-6 (1997); U.S. Patent No. 
5,723,591, the disclosures of which are incorporated herein 
by reference in their entireties.

So labeled, the isolated nucleic acids of the 
present invention can be used as probes, as further 
described below.

Nucleic acids of the present invention can also 
usefully be bound to a substrate. The substrate can be porous 
or solid, planar or non-planar, unitary or distributed; the 
bond can be covalent or noncovalent. Bound to a substrate, 
nucleic acids of the present invention can be used as probes 
in their unlabeled state.

For example, the nucleic acids of the present 
invention can usefully be bound to a porous substrate, 
commonly a membrane, typically comprising nitrocellulose, 
nylon, or positively-charged derivatized nylon; so attached, 
the nucleic acids of the present invention can be used to 
detect MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids present 
within a labeled nucleic acid sample, either a sample of 
genomic nucleic acids or a sample of transcript-derived 
nucleic acids, e.g. by reverse dot blot.

The nucleic acids of the present invention can 
also usefully be bound to a solid substrate, such as glass, 
although other solid materials, such as amorphous silicon, 
crystalline silicon, or plastics, can also be used. Such 
plastics include polymethylacrylic, polyethylene, 
polypropylene, polyacrylate, polymethylmethacrylate, 
polyvinyl chloride, polytetrafluoroethylene, polystyrene, 
polycarbonate, polyacetal, polysulfone, cellulose acetate, 
cellulose nitrate, nitrocellulose, or mixtures thereof.

Typically, the solid substrate will be 
rectangular, although other shapes, particularly disks and 
even spheres, present certain advantages. Particularly 
advantageous alternatives to glass slides as support 
substrates for arrays of nucleic acids are optical discs, as 
described in Demers, "Spatially Addressable Combinatorial 
Chemical Arrays in CD-ROM Format," international patent 
publication WO 98/12559, incorporated herein by reference in 
its entirety.

The nucleic acids of the present invention can be 
attached covalently to a surface of the support substrate or 
applied to a derivatized surface in a chaotropic agent that 
facilitates denaturation and adherence by presumed 
noncovalent interactions, or some combination thereof.

The nucleic acids of the present invention can be 
bound to a substrate to which a plurality of other nucleic 
acids are concurrently bound, hybridization to each of the 
plurality of bound nucleic acids being separately 
detectable. At low density, e.g. on a porous membrane, 
these substrate-bound collections are typically denominated 
macroarrays; at higher density, typically on a solid 
support, such as glass, these substrate-bound collections of 
plural nucleic acids are colloquially termed microarrays. 
As used herein, the term microarray includes arrays of all 
densities. It is, therefore, another aspect of the 
invention to provide microarrays that include the nucleic 
acids of the present invention.

The isolated nucleic acids of the present 
invention can be used as hybridization probes to detect, 
characterize, and quantify MDZ3, MDZ4, MDZ7 or MDZ12 nucleic 
acids in, and isolate MDZ3, MDZ4, MDZ7 or MDZ12 nucleic 
acids from, both genomic and transcript-derived nucleic acid 
samples. When free in solution, such probes are typically, 
but not invariably, detectably labeled; bound to a 
substrate, as in a microarray, such probes are typically, 
but not invariably, unlabeled.

For example, the isolated nucleic acids of the 
present invention can be used as probes to detect and 
characterize gross alterations in the MDZ3, MDZ4, MDZ7 or 
MDZ12 genomic locus, such as deletions, insertions, 
translocations, and duplications of the MDZ3, MDZ4, MDZ7 or 
MDZ12 genomic locus through fluorescence in situ 
hybridization (FISH) to chromosome spreads. See, e.g., 
Andreeff et al. (eds.), Introduction to Fluorescence In Situ 
Hybridization: Principles and Clinical Applications, John 
Wiley & Sons (1999) (ISBN: 0471013455), the disclosure of 
which is incorporated herein by reference in its entirety. 
The isolated nucleic acids of the present invention can be 
used as probes to assess smaller genomic alterations using, 
e.g., Southern blot detection of restriction fragment length 
polymorphisms. The isolated nucleic acids of the present 
invention can be used as probes to isolate genomic clones 
that include the nucleic acids of the present invention, 
which thereafter can be restriction mapped and sequenced to 
identify deletions, insertions, translocations, and 
substitutions (single nucleotide polymorphisms, SNPs) at the 
sequence level.

The isolated nucleic acids of the present 
invention can also be used as probes to detect, 
characterize, and quantify MDZ3, MDZ4, MDZ7 or MDZ12 nucleic 
acids in, and isolate MDZ3, MDZ4, MDZ7 or MDZ12 nucleic 
acids from, transcript-derived nucleic acid samples.

For example, the isolated nucleic acids of the 
present invention can be used as hybridization probes to 
detect, characterize by length, and quantify MDZ3, MDZ4, 
MDZ7 or MDZ12 mRNA by northern blot of total or poly-A+-
selected RNA samples. For example, the isolated nucleic 
acids of the present invention can be used as hybridization 
probes to detect, characterize by location, and quantify 
MDZ3, MDZ4, MDZ7 or MDZ12 message by in situ hybridization 
to tissue sections (see, e.g., Schwarzacher et al., In 
Situ Hybridization, Springer-Verlag New York (2000) (ISBN: 
0387915966), the disclosure of which is incorporated herein 
by reference in its entirety). For example, the isolated 
nucleic acids of the present invention can be used as 
hybridization probes to measure the representation of MDZ3, 
MDZ4, MDZ7 or MDZ12 clones in a cDNA library. For example, 
the isolated nucleic acids of the present invention can be 
used as hybridization probes to isolate MDZ3, MDZ4, MDZ7 or 
MDZ12 nucleic acids from cDNA libraries, permitting sequence 
level characterization of MDZ3, MDZ4, MDZ7 or MDZ12 
messages, including identification of deletions, insertions, 
truncations - including deletions, insertions, and 
truncations of exons in alternatively spliced forms - and 
single nucleotide polymorphisms.

All of the aforementioned probe techniques are 
well within the skill in the art, and are described at 
greater length in standard texts such as Sambrook et al., 
Molecular Cloning: A Laboratory Manual (3rd ed.), Cold 
Spring Harbor Laboratory Press (2001) (ISBN: 0879695773); 
Ausubel et al. (eds.), Short Protocols in Molecular Biology: 
A Compendium of Methods from Current Protocols in Molecular 
Biology (4th ed.), John Wiley & Sons, 1999 (ISBN: 
047132938X); and Walker et al. (eds.), The Nucleic Acids 
Protocols Handbook, Humana Press (2000) (ISBN: 0896034593), 
the disclosures of which are incorporated herein by 
reference in their entireties.

As described in the Examples herein below, the 
nucleic acids of the present invention can also be used to 
detect and quantify MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids 
in transcript-derived samples - that is, to measure 
expression of the MDZ3, MDZ4, MDZ7 or MDZ12 gene - when 
included in a microarray. Measurement of MDZ3, MDZ4, MDZ7 
or MDZ12 expression has particular utility in diagnosis and 
treatment of a variety of diseases, including developmental 
disorders and cancer, as further described in the Examples 
herein below.

As would be readily apparent to one of skill in 
the art, each MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acid probe - 
whether labeled, substrate-bound, or both - is thus 
currently available for use as a tool for measuring the 
level of MDZ3, MDZ4, MDZ7 or MDZ12 expression in each of the 
tissues in which expression has already been confirmed, 
notably testes for MDZ7; brain, testis, heart and bone 
marrow for MDZ3; bone marrow, brain, heart, HeLa cells, adult 
liver, fetal liver, lung, placenta and prostate for MDZ4; 
and brain, heart, kidney, placenta, skeletal muscle, testis, 
HeLa cells, bone marrow and liver for MDZ12. The utility is 
specific to the probe: under high stringency conditions, the 
probe reports the level of expression of message 
specifically containing that portion of the MDZ3, MDZ4, MDZ7 
or MDZ12 gene included within the probe.

Measuring tools are well known in many arts, not 
just in molecular biology, and are known to possess 
credible, specific, and substantial utility. For example, 
U.S. Patent No. 6,016,191 describes and claims a tool for 
measuring characteristics of fluid flow in a hydrocarbon 
well; U.S. Patent No. 6,042,549 describes and claims a 
device for measuring exercise intensity; U.S. Patent No. 
5,889,351 describes and claims a device for measuring 
viscosity and for measuring characteristics of a fluid; U.S. 
Patent No. 5,570,694 describes and claims a device for 
measuring blood pressure; U.S. Patent No. 5,930,143 
describes and claims a device for measuring the dimensions 
of machine tools; U.S. Patent No. 5,279,044 describes and 
claims a measuring device for determining an absolute 
position of a movable element; U.S. Patent No. 5,186,042 
describes and claims a device for measuring action force of 
a wheel; and U.S. Patent No. 4,246,774 describes and claims 
a device for measuring the draft of smoking articles such as 
cigarettes.

As for tissues not yet demonstrated to express 
MDZ3, MDZ4, MDZ7 or MDZ12, the MDZ3, MDZ4, MDZ7 or MDZ12 
nucleic acid probes of the present invention are currently 
available as tools for surveying such tissues to detect the 
presence of MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids.

Survey tools - i.e., tools for determining the 
presence and/or location of a desired object by search of an 
area - are well known in many arts, not just in molecular 
biology, and are known to possess credible, specific, and 
substantial utility. For example, U.S. Patent No. 6,046,800 
describes and claims a device for surveying an area for 
objects that move; U.S. Patent No. 6,025,201 describes and 
claims an apparatus for locating and discriminating 
platelets from non-platelet particles or cells on a 
cell-by-cell basis in a whole blood sample; U.S. Patent No. 
5,990,689 describes and claims a device for detecting and 
locating anomalies in the electromagnetic protection of a 
system; U.S. Patent No. 5,984,175 describes and claims a 
device for detecting and identifying wearable user 
identification units; and U.S. Patent No. 3,980,986 ("Oil well 
survey tool") describes and claims a tool for finding the 
position of a drill bit working at the bottom of a borehole.

As noted above, the nucleic acid probes of the 
present invention are useful in constructing microarrays; 
the microarrays, in turn, are products of manufacture that 
are useful for measuring and for surveying gene expression. 

When included on a microarray, each MDZ3, MDZ4, 
MDZ7 or MDZ12 nucleic acid probe makes the microarray 
specifically useful for detecting that portion of the MDZ3, 
MDZ4, MDZ7 or MDZ12 gene included within the probe, thus 
imparting upon the microarray device the ability to detect a 
signal where, absent such probe, it would have reported no 
signal. This utility makes each individual probe on such 
microarray akin to an antenna, circuit, firmware or software 
element included in an electronic apparatus, where the 
antenna, circuit, firmware or software element imparts upon 
the apparatus the ability newly and additionally to detect 
signal in a portion of the radio-frequency spectrum where 
previously it could not; such devices are known to have 
specific, substantial, and credible utility.

Changes in the level of expression need not be 
observed for the measurement of expression to have utility. 

For example, where gene expression analysis is 
used to assess toxicity of chemical agents on cells, the 
failure of the agent to change a gene's expression level is 
evidence that the drug likely does not affect the pathway of 
which the gene's expressed protein is a part. Analogously, 
where gene expression analysis is used to assess side 
effects of pharmacologic agents - whether in lead compound 
discovery or in subsequent screening of lead compound 
derivatives - the inability of the agent to alter a gene's 
expression level is evidence that the drug does not affect 
the pathway of which the gene's expressed protein is a part.

WO 99/58720, incorporated herein by reference in 
its entirety, provides methods for quantifying the 
relatedness of a first and second gene expression profile 
and for ordering the relatedness of a plurality of gene 
expression profiles, without regard to the identity or 
function of the genes whose expression is used in the 
calculation.
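
As a generic illustration of what it means to quantify and order the relatedness of expression profiles without reference to gene identity, the following Python sketch scores profiles by Pearson correlation. This is an assumption made purely for illustration: it is not asserted to be the method of WO 99/58720, and the function names and sample values are hypothetical.

```python
import math


def pearson(profile_a, profile_b):
    """Pearson correlation between two equal-length expression profiles."""
    if len(profile_a) != len(profile_b) or len(profile_a) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(profile_a)
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    var_a = sum((a - mean_a) ** 2 for a in profile_a)
    var_b = sum((b - mean_b) ** 2 for b in profile_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


def rank_by_relatedness(reference, profiles):
    """Order named profiles from most to least related to the reference."""
    scored = [(name, pearson(reference, values)) for name, values in profiles.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical expression levels for the same four genes in three samples.
reference = [1.0, 2.0, 3.0, 4.0]
samples = {"sample_x": [4.0, 3.0, 2.0, 1.0], "sample_y": [2.0, 4.0, 6.0, 8.0]}
for name, score in rank_by_relatedness(reference, samples):
    print(name, round(score, 3))
```

Note that the calculation uses only the measured levels; which genes the values correspond to, and what those genes do, never enters the score.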

Gene expression analysis, including gene 
expression analysis by microarray hybridization, is, of 
course, principally a laboratory-based art. Devices and 
apparatus used principally in laboratories to facilitate 
laboratory research are well-established to possess 
specific, substantial, and credible utility. For example, 
U.S. Patent No. 6,001,233 describes and claims a gel 
electrophoresis apparatus having a cam-activated clamp; for 
example, U.S. Patent No. 6,051,831 describes and claims a 
high mass detector for use in time-of-flight mass 
spectrometers; for example, U.S. Patent No. 5,824,269 
describes and claims a flow cytometer - as is well known, few 
gel electrophoresis apparatuses, TOF-MS devices, or flow 
cytometers are sold for consumer use.

Indeed, and in particular, nucleic acid 
microarrays, as devices intended for laboratory use in 
measuring gene expression, are well-established to have 
specific, substantial and credible utility. Thus, the 
microarrays of the present invention have at least the 
specific, substantial and credible utilities of the 
microarrays claimed as devices and articles of manufacture 
in the following U.S. patents, the disclosures of each of 
which is incorporated herein by reference: U.S. Patent Nos. 
5,445,934 ("Array of oligonucleotides on a solid 
substrate"); 5,744,305 ("Arrays of materials attached to a 
substrate"); and 6,004,752 ("Solid support with attached 
molecules").

Genome-derived single exon probes and genome-derived 
single exon probe microarrays have the additional 
utility, inter alia, of permitting high-throughput detection 
of splice variants of the nucleic acids of the present 
invention, as further described in copending and commonly 
owned U.S. Patent application no. 09/632,366, filed August 
3, 2000, the disclosure of which is incorporated herein by 
reference in its entirety.






The isolated nucleic acids of the present 
invention can also be used to prime synthesis of nucleic 
acid, for purpose of either analysis or isolation, using 
mRNA, cDNA, or genomic DNA as template. 
For use as primers, at least 17 contiguous 
nucleotides of the isolated nucleic acids of the present 
invention will be used. Often, at least 18, 19, or 20 
contiguous nucleotides of the nucleic acids of the present 
invention will be used, and on occasion at least 20, 22, 24, 
or 25 contiguous nucleotides of the nucleic acids of the 
present invention will be used, and even 30 nucleotides or 
more of the nucleic acids of the present invention can be 
used to prime specific synthesis.
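
The length constraint above can be illustrated by enumerating contiguous windows of a template sequence. The Python sketch below is a minimal example under stated assumptions: the function name is hypothetical, the example sequence is an arbitrary placeholder (not taken from the Sequence Listing), and no melting-temperature or specificity screening is attempted.

```python
MIN_PRIMER_LENGTH = 17  # minimum number of contiguous nucleotides stated in the text


def candidate_primers(sequence, length=20):
    """Return every contiguous window of `length` nucleotides as (start, primer).

    Hypothetical helper: it only enforces the 17-nucleotide minimum and a
    plain A/C/G/T alphabet; it does not evaluate primer quality.
    """
    if length < MIN_PRIMER_LENGTH:
        raise ValueError("primers need at least 17 contiguous nucleotides")
    seq = sequence.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return [(start, seq[start:start + length])
            for start in range(len(seq) - length + 1)]


# Arbitrary 20-nucleotide placeholder sequence, for illustration only.
placeholder = "ATGGCCAAGTTCGACCTGAA"
for start, primer in candidate_primers(placeholder, length=17):
    print(start, primer)
```

A template shorter than the requested length simply yields an empty list, so callers can treat "no candidates" uniformly.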

The nucleic acid primers of the present invention 
can be used, for example, to prime first strand cDNA 
synthesis on an mRNA template.

Such primer extension can be done directly to 
analyze the message. Alternatively, synthesis on an mRNA 
template can be done to produce first strand cDNA. The 
first strand cDNA can thereafter be used, inter alia, 
directly as a single-stranded probe, as above-described, as 
a template for sequencing - permitting identification of 
alterations, including deletions, insertions, and 
substitutions, both normal allelic variants and mutations 
associated with abnormal phenotypes - or as a template, 
either for second strand cDNA synthesis (e.g., as an 
antecedent to insertion into a cloning or expression 
vector), or for amplification.

The nucleic acid primers of the present invention 
can also be used, for example, to prime single base 
extension (SBE) for SNP detection (see, e.g., U.S. Pat. No. 
6,004,744, the disclosure of which is incorporated herein by 
reference in its entirety).

As another example, the nucleic acid primers of 
the present invention can be used to prime amplification of 
MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids, using transcript-
derived or genomic DNA as template.

Primer-directed amplification methods are now 
well-established in the art. Methods for performing the 
polymerase chain reaction (PCR) are compiled, inter alia, in 
McPherson, PCR (Basics: From Background to Bench), Springer 
Verlag (2000) (ISBN: 0387916008); Innis et al. (eds.), PCR 
Applications: Protocols for Functional Genomics, Academic 
Press (1999) (ISBN: 0123721857); Gelfand et al. (eds.), PCR 
Strategies, Academic Press (1998) (ISBN: 0123721822); Newton 
et al., PCR, Springer-Verlag New York (1997) (ISBN: 
0387915060); Burke (ed.), PCR: Essential Techniques, John 
Wiley & Son Ltd (1996) (ISBN: 047195697X); White (ed.), PCR 
Cloning Protocols: From Molecular Cloning to Genetic 
Engineering, Vol. 67, Humana Press (1996) (ISBN: 
0896033430); McPherson et al. (eds.), PCR 2: A Practical 
Approach, Oxford University Press, Inc. (1995) (ISBN: 
0199634254), the disclosures of which are incorporated 
herein by reference in their entireties. Methods for 
performing RT-PCR are collected, e.g., in Siebert et al. 
(eds.), Gene Cloning and Analysis by RT-PCR, Eaton 
Publishing Company/BioTechniques Books Division, 1998 
(ISBN: 1881299147); Siebert (ed.), PCR Technique: RT-PCR, 
Eaton Publishing Company/BioTechniques Books (1995) 
(ISBN: 1881299139), the disclosures of which are incorporated 
herein by reference in their entireties.

Isothermal amplification approaches, such as 
rolling circle amplification, are also now well-described. 
See, e.g., Schweitzer et al., Curr. Opin. Biotechnol. 
12(1):21-7 (2001); U.S. Patent Nos. 6,235,502, 6,221,603, 
6,210,884, 6,183,960, 5,854,033, 5,714,320, 5,648,245, and 
international patent publications WO 97/19193 and 
WO 00/15779, the disclosures of which are incorporated 
herein by reference in their entireties. Rolling circle 
amplification can be combined with other techniques to 
facilitate SNP detection. See, e.g., Lizardi et al., Nature 
Genet. 19(3):225-32 (1998).

As further described below, nucleic acids of the 
present invention, inserted into vectors that flank the 
nucleic acid insert with a phage promoter, such as T7, T3, 
or SP6 promoter, can be used to drive in vitro expression of 
RNA complementary to either strand of the nucleic acid of 
the present invention. The RNA can be used, inter alia, as 
a single-stranded probe, in cDNA-mRNA subtraction, or for in 
vitro translation. 
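
The relationship between an insert and the RNA transcribed from promoters flanking it on either side can be sketched in a few lines of Python. The example is illustrative only: the function names are hypothetical, and which promoter (T7, T3, or SP6) yields which transcript depends on the orientation of the insert in the particular vector, which is not specified here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def transcripts_from_flanking_promoters(insert_top_strand):
    """Return the two RNAs obtainable by transcribing the insert in each direction.

    One transcript has the same sequence as the top strand (T replaced by U)
    and is therefore complementary to the bottom strand; the other is
    complementary to the top strand. Names are hypothetical.
    """
    top = insert_top_strand.upper()
    if set(top) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return {
        "complementary_to_bottom_strand": top.replace("T", "U"),
        "complementary_to_top_strand": reverse_complement(top).replace("T", "U"),
    }


# Arbitrary placeholder insert, for illustration only.
print(transcripts_from_flanking_promoters("ATGGCC"))
```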

As will be further discussed herein below, nucleic 
acids of the present invention that encode MDZ3, MDZ4, MDZ7 
or MDZ12 protein or portions thereof can be used, inter 
alia, to express the MDZ3, MDZ4, MDZ7 or MDZ12 proteins or 
protein fragments, either alone, or as part of fusion 
proteins.

Expression can be from genomic nucleic acids of 
the present invention, or from transcript-derived nucleic 
acids of the present invention. 

Where protein expression is effected from genomic 
DNA, expression will typically be effected in eukaryotic, 
typically mammalian, cells capable of splicing introns from 
the initial RNA transcript. Expression can be driven from 
episomal vectors, such as EBV-based vectors, or can be 
effected from genomic DNA integrated into a host cell 
chromosome. As will be more fully described below, where 
expression is from transcript-derived (or otherwise 
intron-less) nucleic acids of the present invention, 
expression can be effected in a wide variety of prokaryotic 
or eukaryotic cells.

Expressed in vitro, the protein, protein fragment, 
or protein fusion can thereafter be isolated, to be used, 
inter alia, as a standard in immunoassays specific for the 
proteins, or protein isoforms, of the present invention; to 
be used as a therapeutic agent, e.g., to be administered as 
passive replacement therapy in individuals deficient in the 
proteins of the present invention, or to be administered as 
a vaccine; to be used for in vitro production of specific 
antibody, the antibody thereafter to be used, e.g., as an 
analytical reagent for detection and quantitation of the 
proteins of the present invention or to be used as an 
immunotherapeutic agent.

The isolated nucleic acids of the present 
invention can also be used to drive in vivo expression of 
the proteins of the present invention. In vivo expression 
can be driven from a vector - typically a viral vector, 
often a vector based upon a replication incompetent 
retrovirus, an adenovirus, or an adeno-associated virus 
(AAV) - for purpose of gene therapy. In vivo expression can 
also be driven from signals endogenous to the nucleic acid 
or from a vector, often a plasmid vector, such as pVAX1 
(Invitrogen, Carlsbad CA, USA), for purpose of "naked" 
nucleic acid vaccination, as further described in U.S. Pat. 
Nos. 5,589,466; 5,679,647; 5,804,566; 5,830,877; 5,843,913; 
5,880,104; 5,958,891; 5,985,847; 6,017,897; 6,110,898; 
6,204,250, the disclosures of which are incorporated herein 
by reference in their entireties. 

The nucleic acids of the present invention can 
also be used for antisense inhibition of transcription or 
translation. See Phillips (ed.), Antisense Technology, Part
B, Methods in Enzymology Vol. 314, Academic Press, Inc.
(1999) (ISBN: 012182215X); Phillips (ed.), Antisense
Technology, Part A, Methods in Enzymology Vol. 313, Academic
Press, Inc. (1999) (ISBN: 0121822141); Hartmann et al.
(eds.), Manual of Antisense Methodology (Perspectives in
Antisense Science), Kluwer Law International (1999)
(ISBN: 079238539X); Stein et al. (eds.), Applied Antisense
Oligonucleotide Technology, Wiley-Liss (1998) (ISBN:
0471172790); Agrawal et al. (eds.), Antisense Research and
Application, Springer-Verlag New York, Inc. (1998) (ISBN:
3540638334); Lichtenstein et al. (eds.), Antisense
Technology: A Practical Approach, Vol. 185, Oxford
University Press, Inc. (1998) (ISBN: 0199635838); Gibson
(ed.), Antisense and Ribozyme Methodology: Laboratory
Companion, Chapman & Hall (1997) (ISBN: 3826100794);
Chadwick et al. (eds.), Oligonucleotides as Therapeutic
Agents - Symposium No. 209, John Wiley & Son Ltd (1997)
(ISBN: 0471972797), the disclosures of which are
incorporated herein by reference in their entireties.
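
By way of illustration only, the base-pairing relationship that underlies an antisense reagent can be sketched as follows. The sketch assumes that an antisense DNA oligonucleotide is simply the reverse complement of a chosen window of the target transcript; the function name, the window coordinates and the toy transcript are hypothetical and are not taken from the Sequence Listing, and practical antisense design involves further considerations addressed in the references cited above.

```python
# Illustrative sketch only: derive a DNA antisense oligonucleotide as the
# reverse complement of a window of a target transcript.
# The function name and the toy transcript below are hypothetical.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return the DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # Treat the transcript as its DNA equivalent (U -> T) for simplicity.
    target = mrna.upper().replace("U", "T")[start:start + length]
    if len(target) != length:
        raise ValueError("window runs past the end of the transcript")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    toy_transcript = "AUGGCCUUU"  # hypothetical, not an MDZ sequence
    print(antisense_oligo(toy_transcript, 0, 6))
```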

Nucleic acids of the present invention, 
particularly cDNAs of the present invention, that encode 
full-length human MDZ3, MDZ4, MDZ7 or MDZ12 protein
isoforms, have additional, well-recognized, immediate, real 
world utility as commercial products of manufacture suitable 
for sale. 

For example, Invitrogen Corp. (Carlsbad, CA, USA), 
through its Research Genetics subsidiary, sells full length 
human cDNAs cloned into one of a selection of expression 
vectors as GeneStorm® expression-ready clones; utility is 
specific for the gene, since each gene is capable of being 
ordered separately and has a distinct catalogue number, and 
utility is substantial, each clone selling for $650.00 US. 
Similarly, Incyte Genomics (Palo Alto, CA, USA) sells clones
from public and proprietary sources in multi-well plates or
individual tubes.

Nucleic acids of the present invention that 
include genomic regions encoding the human MDZ3, MDZ4, MDZ7
or MDZ12 protein, or portions thereof, have yet further
utilities.

For example, genomic nucleic acids of the present 
invention can be used as amplification substrates, e.g. for 
preparation of genome-derived single exon probes of the 
present invention, as described above and in copending and 
commonly-owned U.S. patent application nos. 09/864,761, 
filed May 23, 2001, 09/774,203, filed January 29, 2001, and 
09/632,366, filed August 3, 2000, the disclosures of which 
are incorporated herein by reference in their entireties. 

As another example, genomic nucleic acids of the 
present invention can be integrated non-homologously into 
the genome of somatic cells, e.g. CHO cells, COS cells, or 
293 cells, with or without amplification of the insertional 
locus, in order, e.g., to create stable cell lines capable 
of producing the proteins of the present invention. 

As another example, more fully described herein 
below, genomic nucleic acids of the present invention can be 
integrated nonhomologously into embryonic stem (ES) cells to 
create transgenic non-human animals capable of producing the 
proteins of the present invention. 

Genomic nucleic acids of the present invention can 
also be used to target homologous recombination to the human 
MDZ3, MDZ4, MDZ7 or MDZ12 locus, respectively. See, e.g.,
U.S. Patent Nos. 6,187,305; 6,204,061; 5,631,153; 5,627,059;
5,487,992; 5,464,764; 5,614,396; 5,527,695 and 6,063,630;
and Kmiec et al. (eds.), Gene Targeting Protocols, Vol. 133,
Humana Press (2000) (ISBN: 0896033600); Joyner (ed.), Gene
Targeting: A Practical Approach, Oxford University Press,
Inc. (2000) (ISBN: 0199637938); Sedivy et al., Gene
Targeting, Oxford University Press (1998) (ISBN:
071677013X); Tymms et al. (eds.), Gene Knockout Protocols,
Humana Press (2000) (ISBN: 0896035727); Mak et al. (eds.),
The Gene Knockout FactsBook, Vol. 2, Academic Press, Inc.
(1998) (ISBN: 0124660444); Torres et al., Laboratory
Protocols for Conditional Gene Targeting, Oxford University
Press (1997) (ISBN: 019963677X); Vega (ed.), Gene Targeting,
CRC Press, LLC (1994) (ISBN: 084938950X), the disclosures of
which are incorporated herein by reference in their
entireties.

Where the genomic region includes transcription
regulatory elements, homologous recombination can be used to
alter the expression of MDZ3, MDZ4, MDZ7 or MDZ12, both for
purpose of in vitro production of MDZ3, MDZ4, MDZ7 or MDZ12
protein from human cells, and for purpose of gene therapy.
See, e.g., U.S. Pat. Nos. 5,981,214; 6,048,524; 5,272,071.

Fragments of the nucleic acids of the present
invention smaller than those typically used for homologous
recombination can also be used for targeted gene correction
or alteration, possibly by cellular mechanisms different
from those engaged during homologous recombination.

For example, partially duplexed RNA/DNA chimeras
have been shown to have utility in targeted gene correction,
U.S. Pat. Nos. 5,945,339; 5,888,983; 5,871,984; 5,795,972;
5,780,296; 5,760,012; 5,756,325; 5,731,181, the disclosures
of which are incorporated herein by reference in their
entireties. So too have small oligonucleotides fused to
triplexing domains been shown to have utility in
targeted gene correction, Culver et al., "Correction of
chromosomal point mutations in human cells with bifunctional
oligonucleotides," Nature Biotechnol. 17(10):989-93 (1999),
as have oligonucleotides having modified terminal bases or
modified terminal internucleoside bonds, Gamper et al.,
Nucl. Acids Res. 28(21):4332-9 (2000), the disclosures of
which are incorporated herein by reference.

The isolated nucleic acids of the present 
invention can also be used to provide the initial substrate 
for recombinant engineering of MDZ3, MDZ4, MDZ7 or MDZ12 
protein variants having desired phenotypic improvements. 
Such engineering includes, for example, site-directed 
mutagenesis, random mutagenesis with subsequent functional 
screening, and more elegant schemes for recombinant 
evolution of proteins, as are described, inter alia, in U.S. 
Pat. Nos. 6,180,406; 6,165,793; 6,117,679; and 6,096,548, 
the disclosures of which are incorporated herein by 
reference in their entireties. 

Nucleic acids of the present invention can be 
obtained by using the labeled probes of the present 
invention to probe nucleic acid samples, such as genomic 
libraries, cDNA libraries, and mRNA samples, by standard 
techniques. Nucleic acids of the present invention can also 
be obtained by amplification, using the nucleic acid primers 
of the present invention, as further demonstrated in Example 
1, herein below. Nucleic acids of the present invention of 
fewer than about 100 nt can also be synthesized chemically, 
typically by solid phase synthesis using commercially 
available automated synthesizers. 

"Full Length" human MDZ3, MDZ4, MDZ7 or MDZ12 
Nucleic Acids 

In a first series of nucleic acid embodiments, the
invention provides isolated nucleic acids that encode the
entirety of the MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS or MDZ12bL
protein. As discussed above, the "full-length" nucleic
acids of the present invention can be used, inter alia, to
express full length MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS or
MDZ12bL protein. The full-length nucleic acids can also be
used as nucleic acid probes; used as probes, the isolated
nucleic acids of these embodiments will respectively
hybridize to MDZ3, MDZ4, MDZ7 or MDZ12, or to portions
thereof.

In a first such embodiment, the invention provides
an isolated nucleic acid comprising (i) the nucleotide
sequence of the nucleic acid of any of ATCC deposits
(MDZ3), (MDZ4), (MDZ7), and (MDZ12a and MDZ12b),
(ii) the nucleotide sequence of any of SEQ ID NOs:
1, 3027, 4407, 5770, 6938 or (iii) the complement of (i) or
(ii). The ATCC deposits have, and SEQ ID Nos: 1, 3027,
4407, 5770, 6938 present, the entire cDNA of MDZ3, MDZ4,
MDZ7, and MDZ12a and MDZ12b, respectively, including the 5'
untranslated (UT) region and 3' UT (except for MDZ12b).

In a second embodiment, the invention provides an 
isolated nucleic acid comprising (i) the nucleotide sequence 
of SEQ ID Nos: 2, 3028, 4408, 5771, (ii) a degenerate
variant of the nucleotide sequence of SEQ ID Nos: 2, 3028,
4408, 5771, or (iii) the complement of (i) or (ii). SEQ ID
Nos: 2, 3028, 4408, 5771 present the open reading frame
(ORF) from SEQ ID Nos: 1, 3027, 4407, 5770, respectively.
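
The relationship among an open reading frame, a degenerate variant thereof, and the complement can be sketched as follows. This is a minimal illustration that assumes the standard genetic code and treats two equal-length reading frames as degenerate variants of one another when they translate to the same amino acid sequence; the function names and toy sequences are hypothetical and are not drawn from the Sequence Listing.

```python
# Illustrative sketch only, assuming the standard genetic code.
# All names and toy sequences here are hypothetical.
from itertools import product

_BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(orf: str) -> str:
    """Translate a DNA reading frame codon by codon ('*' marks a stop)."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if both frames have equal length and encode the same polypeptide."""
    return len(orf_a) == len(orf_b) and translate(orf_a) == translate(orf_b)


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    toy_orf = "ATGGCCTAA"      # hypothetical frame
    toy_variant = "ATGGCTTAG"  # differs only at synonymous positions
    print(translate(toy_orf), is_degenerate_variant(toy_orf, toy_variant))
    print(reverse_complement(toy_orf))
```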

In a third embodiment, the invention provides an 
isolated nucleic acid of no more than about 100 kb, often no 
more than about 75 kb, typically no more than about 50 kb, 
and usefully no more than about 25 kb, even no more than 
about 10 kb, comprising (i) a nucleotide sequence that 
encodes a polypeptide with the amino acid sequence of any of 
SEQ ID Nos: 3, 3029, 4409, 5772, 6939, or 6940, or (ii) the
complement of a nucleotide sequence that encodes a
polypeptide with the amino acid sequence of any of SEQ ID
Nos: 3, 3029, 4409, 5772, 6939, or 6940. SEQ ID Nos: 3,
3029, 4409, 5772, 6939, 6940 provide the amino acid sequence
of MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS and MDZ12bL,
respectively.

In a fourth embodiment, the invention provides an
isolated nucleic acid of no more than about 100 kb, often no
more than about 75 kb, typically no more than about 50 kb,
and usefully no more than about 25 kb, even no more than
about 10 kb, having a nucleotide sequence that (i) encodes a
polypeptide having the sequence of any of SEQ ID NOs: 3,
3029, 4409, 5772, 6939, or 6940 with conservative amino acid
substitutions, (ii) encodes a polypeptide having the
sequence of any of SEQ ID Nos: 3, 3029, 4409, 5772, 6939, or
6940 with moderately conservative amino acid substitutions,
or (iii) is the complement of (i) or (ii).

Selected Partial Nucleic Acids 

In a second series of nucleic acid embodiments,
the invention provides isolated nucleic acids that encode
select portions of MDZ3, MDZ4, MDZ7 and MDZ12, respectively.
As will be further discussed herein below, these "partial"
nucleic acids can be used, inter alia, to express specific
portions of MDZ3, MDZ4, MDZ7 and MDZ12, respectively. These
"partial" nucleic acids can also be used, inter alia, as
nucleic acid probes.

Selected Partial Nucleic Acids of MDZ3 

In a first such embodiment, the invention provides
an isolated nucleic acid comprising (i) the nucleotide
sequence of SEQ ID NO: 4, (ii) the nucleotide sequence of
SEQ ID NO: 5, (iii) the nucleotide sequence of SEQ ID NO: 6,
(iv) a degenerate variant of SEQ ID NO: 6, or (v) the
complement of any of (i) - (iv). SEQ ID NO: 4 presents that
portion of MDZ3 not present in known ESTs, with SEQ ID NO: 5
representing the 5' UT portion of SEQ ID NO: 4 and SEQ ID
NO: 6 representing the coding region of SEQ ID NO: 4. Often,
the isolated nucleic acids of this embodiment are no more
than about 100 kb in length, often no more than about 75 kb
in length, or 50 kb, or even 25 kb in length, and can be no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 7 or (ii) the complement of a
nucleotide sequence that encodes SEQ ID NO: 7, wherein the
isolated nucleic acid is no more than about 100 kb in
length, often no more than about 75 kb in length, or 50 kb,
or even 25 kb in length, and can be no more than about 15 kb
in length, and frequently no more than about 10 kb in
length.

SEQ ID NO: 7 is the amino acid sequence encoded by 
SEQ ID NO: 6. 

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 7 with conservative substitutions,
(ii) a nucleotide sequence that encodes SEQ ID NO: 7 with
moderately conservative substitutions, or (iii) the
complement of (i) or (ii), wherein the isolated nucleic acid
is no more than about 100 kb in length, often no more than
about 75 kb in length, or 50 kb, or even 25 kb in length,
and can be no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

Selected Partial Nucleic Acids of MDZ4

In a first such embodiment, the invention provides
an isolated nucleic acid comprising (i) the nucleotide
sequence of SEQ ID NO: 3030, (ii) the nucleotide sequence of
SEQ ID NO: 3031, (iii) the nucleotide sequence of SEQ ID NO:
3032, (iv) a degenerate variant of the nucleotide sequence
of SEQ ID NO: 3032, or (v) the complement of any of (i) -
(iv). SEQ ID NO: 3030 presents a portion of MDZ4 not
present in known ESTs, with SEQ ID NO: 3031 representing the
5' UT portion of SEQ ID NO: 3030 and SEQ ID NO: 3032
representing the coding region of SEQ ID NO: 3030, wherein
the isolated nucleic acids of this embodiment are no more
than about 100 kb in length, often no more than about 75 kb
in length, or 50 kb, or even 25 kb in length, and can be no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 3033 or (ii) the complement of a
nucleotide sequence that encodes SEQ ID NO: 3033, wherein
the isolated nucleic acid is no more than about 100 kb in
length, typically no more than about 75 kb in length,
frequently no more than about 50 kb in length. SEQ ID NO:
3033 is the amino acid sequence encoded by that portion of
MDZ4 not present in known EST fragments. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 3033 with conservative
substitutions, (ii) a nucleotide sequence that encodes SEQ
ID NO: 3033 with moderately conservative substitutions, or
(iii) the complement of (i) or (ii), wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) the nucleotide sequence
of SEQ ID NO: 3034, (ii) the nucleotide sequence of SEQ ID
NO: 3035, (iii) a degenerate variant of the nucleotide
sequence of SEQ ID NO: 3035, (iv) the nucleotide sequence of
SEQ ID NO: 3036, or (v) the complement of any of (i) - (iv),
wherein the isolated nucleic acid is no more than about 100
kb in length, typically no more than about 75 kb in length,
and often no more than about 50 kb in length. Often, the
isolated nucleic acids of this embodiment are no more than
about 25 kb in length, often no more than about 15 kb in
length, and frequently no more than about 10 kb in length.

SEQ ID NO: 3034 presents a portion of MDZ4 not 
present in known ESTs, with SEQ ID NO: 3035 representing a 
coding region portion of SEQ ID NO: 3034, and with SEQ ID 
NO: 3036 representing the 3' UT portion of SEQ ID NO: 3034. 

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 3037 or (ii) the complement of a
nucleotide sequence that encodes SEQ ID NO: 3037, wherein
the isolated nucleic acid is no more than about 100 kb in
length, typically no more than about 75 kb in length,
frequently no more than about 50 kb in length. SEQ ID NO:
3037 is the amino acid sequence encoded by SEQ ID NO: 3035.
Often, the isolated nucleic acids of this embodiment are no
more than about 25 kb in length, often no more than about 15
kb in length, and frequently no more than about 10 kb in
length.

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 3037 with conservative
substitutions, (ii) a nucleotide sequence that encodes SEQ
ID NO: 3037 with moderately conservative substitutions, or
(iii) the complement of (i) or (ii), wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

Selected Partial Nucleic Acids of MDZ7 

In a first such embodiment, the invention provides
an isolated nucleic acid comprising the nucleotide sequence
of SEQ ID NO: 4410 or SEQ ID NO: 4411, or the complement
thereof, wherein the isolated nucleic acid of this aspect of
the invention is no more than about 100 kb in length, often
no more than about 75 kb in length, more typically no more
than about 50 kb in length. Often, the isolated nucleic acids
of this embodiment are no more than about 25 kb in length,
often no more than about 15 kb in length, and frequently no
more than about 10 kb in length.

SEQ ID Nos: 4410 and 4411 present those portions
of MDZ7 not present in known ESTs.

Selected Partial Nucleic Acids of MDZ12

In a first such embodiment, the invention provides
an isolated nucleic acid comprising (i) the nucleotide
sequence of SEQ ID NO: 5773, (ii) a degenerate variant of
SEQ ID NO: 5773, or (iii) the complement of (i) or (ii),
wherein the isolated nucleic acid is no more than about 100
kb in length, typically no more than about 75 kb in length,
more typically no more than about 50 kb in length. Often, the
isolated nucleic acids of this embodiment are no more than
about 25 kb in length, often no more than about 15 kb in
length, and frequently no more than about 10 kb in length.

SEQ ID NO: 5773 encodes a portion of MDZ12a not 
present in known ESTs. 

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 5774 or (ii) the complement of a
nucleotide sequence that encodes SEQ ID NO: 5774, wherein
the isolated nucleic acid is no more than about 100 kb in
length, typically no more than about 75 kb in length,
frequently no more than about 50 kb in length. SEQ ID NO:
5774 is the amino acid sequence encoded by that portion of
MDZ12a not found in any EST fragments. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 5774 with conservative
substitutions, (ii) a nucleotide sequence that encodes SEQ
ID NO: 5774 with moderately conservative substitutions, or
(iii) the complement of (i) or (ii), wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In another such embodiment, the invention provides
an isolated nucleic acid comprising (i) the nucleotide
sequence of SEQ ID NO: 6941, (ii) a degenerate variant of
SEQ ID NO: 6941, or (iii) the complement of (i) or (ii),
wherein the isolated nucleic acid is no more than about 100
kb in length, typically no more than about 75 kb in length,
more typically no more than about 50 kb in length. SEQ ID NO:
6941 is the novel exon inserted in the MDZ12b splice
variant. Often, the isolated nucleic acids of this
embodiment are no more than about 25 kb in length, often no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 6942 or (ii) the complement of a
nucleotide sequence that encodes SEQ ID NO: 6942, wherein
the isolated nucleic acid is no more than about 100 kb in
length, typically no more than about 75 kb in length,
frequently no more than about 50 kb in length. SEQ ID NO:
6942 is the amino acid sequence encoded by the novel MDZ12b
exon before the stop codon. Often, the isolated nucleic
acids of this embodiment are no more than about 25 kb in
length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In another embodiment, the invention provides an
isolated nucleic acid comprising (i) a nucleotide sequence
that encodes SEQ ID NO: 6942 with conservative
substitutions, (ii) a nucleotide sequence that encodes SEQ
ID NO: 6942 with moderately conservative substitutions, or
(iii) the complement of (i) or (ii), wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

Cross-Hybridizing Nucleic Acids 

In another series of nucleic acid embodiments, the 
invention provides isolated nucleic acids that hybridize to 
various of the MDZ3, MDZ4, MDZ7 and MDZ12 nucleic acids of 
the present invention. These cross-hybridizing nucleic 
acids can be used, inter alia, as probes for, and to drive 
expression of, proteins that are related to MDZ3, MDZ4, MDZ7 
and MDZ12 of the present invention as further isoforms, 
homologues, paralogues, or orthologues. 

Cross-Hybridizing Nucleic Acids of MDZ3 

In a first such embodiment, the invention provides
an isolated nucleic acid comprising a sequence that
hybridizes under high stringency conditions to a probe the
nucleotide sequence of which consists of at least 17, 18,
19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 4
or of the complement of SEQ ID NO: 4, wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.
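
By way of illustration only, the set of probes described above can be enumerated computationally. The following sketch assumes that a probe "consisting of at least 17 nt" of a reference sequence, or of its complement, is a contiguous window of the chosen length taken from either strand; the function names and the toy reference are hypothetical and are not taken from the Sequence Listing. The sketch does not model hybridization stringency, which depends on experimental conditions defined elsewhere herein.

```python
# Illustrative sketch only: enumerate contiguous probe windows from a
# reference sequence and from its complement. Names and the toy reference
# are hypothetical; hybridization stringency is not modeled.
from typing import Iterator

MIN_PROBE_LENGTH = 17  # shortest probe length recited in the text


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def candidate_probes(reference: str, length: int = MIN_PROBE_LENGTH) -> Iterator[str]:
    """Yield every contiguous window of `length` nt from both strands."""
    if length < MIN_PROBE_LENGTH:
        raise ValueError(f"probe length must be at least {MIN_PROBE_LENGTH} nt")
    ref = reference.upper()
    for strand in (ref, reverse_complement(ref)):
        # A strand shorter than `length` simply yields nothing.
        for start in range(len(strand) - length + 1):
            yield strand[start:start + length]


if __name__ == "__main__":
    toy_reference = "ATGGCGTACCTGAAGTCCGATTACG"  # hypothetical 25 nt sequence
    for probe in candidate_probes(toy_reference, 20):
        print(probe)
```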

In a further embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under moderate stringency conditions to a probe the
nucleotide sequence of which consists of at least 17, 18,
19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO: 4
or of the complement of SEQ ID NO: 4, wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In a further embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under high stringency conditions to a hybridization probe
the nucleotide sequence of which (i) encodes a polypeptide
having the sequence of SEQ ID NO: 7, (ii) encodes a
polypeptide having the sequence of SEQ ID NO: 7 with
conservative amino acid substitutions, or (iii) is the
complement of (i) or (ii), wherein the isolated nucleic acid
is no more than about 100 kb in length, typically no more
than about 75 kb in length, and often no more than about 50
kb in length. Often, the isolated nucleic acids of this
embodiment are no more than about 25 kb in length, often no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.

Cross-Hybridizing Nucleic Acids of MDZ4

In a first such embodiment, the invention provides
an isolated nucleic acid comprising a sequence that
hybridizes under high stringency conditions to a probe the
nucleotide sequence of which consists of at least 17, 18,
19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO:
3030 or of the complement of SEQ ID NO: 3030, wherein the
isolated nucleic acid is no more than about 100 kb in
length, typically no more than about 75 kb in length, and
often no more than about 50 kb in length. Often, the
isolated nucleic acids of this embodiment are no more than
about 25 kb in length, often no more than about 15 kb in
length, and frequently no more than about 10 kb in length.

In a further embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under moderate stringency conditions to a probe the
nucleotide sequence of which consists of at least 17, 18,
19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO:
3030 or of the complement of SEQ ID NO: 3030, wherein the
isolated nucleic acid is no more than about 100 kb in
length, typically no more than about 75 kb in length, and
often no more than about 50 kb in length. Often, the
isolated nucleic acids of this embodiment are no more than
about 25 kb in length, often no more than about 15 kb in
length, and frequently no more than about 10 kb in length.

In a further embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under high stringency conditions to a hybridization probe
the nucleotide sequence of which (i) encodes a polypeptide
having the sequence of SEQ ID NO: 3033, (ii) encodes a
polypeptide having the sequence of SEQ ID NO: 3033 with
conservative amino acid substitutions, or (iii) is the
complement of (i) or (ii), wherein the isolated nucleic acid
is no more than about 100 kb in length, typically no more
than about 75 kb in length, and often no more than about 50
kb in length. Often, the isolated nucleic acids of this
embodiment are no more than about 25 kb in length, often no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.

In yet another embodiment, the invention provides
an isolated nucleic acid comprising a sequence that
hybridizes under high stringency conditions to a probe the
nucleotide sequence of which consists of at least 17, 18,
19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO:
3034 or of the complement of SEQ ID NO: 3034, wherein the
isolated nucleic acid is no more than about 100 kb in
length, typically no more than about 75 kb in length, and
often no more than about 50 kb in length. Often, the
isolated nucleic acids of this embodiment are no more than
about 25 kb in length, often no more than about 15 kb in
length, and frequently no more than about 10 kb in length.

In a further embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under moderate stringency conditions to a probe the
nucleotide sequence of which consists of at least 17, 18,
19, 20, 21, 22, 23, 24, 25, 30, 40, or 50 nt of SEQ ID NO:
3034 or of the complement of SEQ ID NO: 3034, wherein the
isolated nucleic acid is no more than about 100 kb in
length, typically no more than about 75 kb in length, and
often no more than about 50 kb in length. Often, the
isolated nucleic acids of this embodiment are no more than
about 25 kb in length, often no more than about 15 kb in
length, and frequently no more than about 10 kb in length.

In a further embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under high stringency conditions to a hybridization probe
the nucleotide sequence of which (i) encodes a polypeptide
having the sequence of SEQ ID NO: 3037, (ii) encodes a
polypeptide having the sequence of SEQ ID NO: 3037 with
conservative amino acid substitutions, or (iii) is the
complement of (i) or (ii), wherein the isolated nucleic acid
is no more than about 100 kb in length, typically no more
than about 75 kb in length, and often no more than about 50
kb in length. Often, the isolated nucleic acids of this
embodiment are no more than about 25 kb in length, often no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.

Cross-Hybridizing Nucleic Acids of MDZ7

In a first such embodiment, the invention provides
an isolated nucleic acid comprising a sequence that
hybridizes under high stringency conditions to a probe the
nucleotide sequence of which consists of SEQ ID NO: 4410 or
the complement of SEQ ID NO: 4410, wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In a further embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under moderate stringency conditions to a probe the
nucleotide sequence of which consists of SEQ ID NO: 4410 or
the complement of SEQ ID NO: 4410, wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In a further embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under high stringency conditions to a probe the nucleotide
sequence of which consists of SEQ ID NO: 4411 or the
complement of SEQ ID NO: 4411, wherein the isolated nucleic
acid is no more than about 100 kb in length, typically no
more than about 75 kb in length, and often no more than
about 50 kb in length. Often, the isolated nucleic acids of
this embodiment are no more than about 25 kb in length,
often no more than about 15 kb in length, and frequently no
more than about 10 kb in length.

In a further embodiment, the invention provides an
isolated nucleic acid comprising a sequence that hybridizes
under moderate stringency conditions to a probe the
nucleotide sequence of which consists of SEQ ID NO: 4411 or
the complement of SEQ ID NO: 4411, wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.



Cross-Hybridizing Nucleic Acids of MDZ12

In a first such embodiment, the invention provides
an isolated nucleic acid comprising a sequence that
hybridizes under high stringency conditions to a probe the
nucleotide sequence of which consists of SEQ ID NO: 5773 or
the complement of SEQ ID NO: 5773, wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, and often no
more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In a further embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under moderate stringency conditions to a probe the 
nucleotide sequence of which consists of SEQ ID NO: 5773 or 
the complement of SEQ ID NO: 5773, wherein the isolated 
nucleic acid is no more than about 100 kb in length, 
typically no more than about 75 kb in length, and often no 
more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 
kb in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

In a further embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under high stringency conditions to a hybridization probe 
the nucleotide sequence of which (i) encodes a polypeptide 
having the sequence of SEQ ID NO: 5774, (ii) encodes a 
polypeptide having the sequence of SEQ ID NO: 5774 with 
conservative amino acid substitutions, or (iii) is the 
complement of (i) or (ii), wherein the isolated nucleic acid 
is no more than about 100 kb in length, typically no more 
than about 75 kb in length, and often no more than about 50 
kb in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In another such embodiment, the invention provides 
an isolated nucleic acid comprising a sequence that 
hybridizes under high stringency conditions to a probe the 
nucleotide sequence of which consists of SEQ ID NO: 6941 or 
the complement of SEQ ID NO: 6941, wherein the isolated 
nucleic acid is no more than about 100 kb in length, 
typically no more than about 75 kb in length, and often no 
more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 
kb in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

In a further embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under moderate stringency conditions to a probe the 
nucleotide sequence of which consists of SEQ ID NO: 6941 or 
the complement of SEQ ID NO: 6941, wherein the isolated 
nucleic acid is no more than about 100 kb in length, 
typically no more than about 75 kb in length, and often no 
more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 
kb in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

In a further embodiment, the invention provides an 
isolated nucleic acid comprising a sequence that hybridizes 
under high stringency conditions to a hybridization probe 
the nucleotide sequence of which (i) encodes a polypeptide 
having the sequence of SEQ ID NO: 6942, (ii) encodes a 
polypeptide having the sequence of SEQ ID NO: 6942 with 
conservative amino acid substitutions, or (iii) is the 
complement of (i) or (ii), wherein the isolated nucleic acid 
is no more than about 100 kb in length, typically no more 
than about 75 kb in length, and often no more than about 50 
kb in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

Particularly Useful Nucleic Acids 

Particularly Useful Nucleic Acids of MDZ3 

Particularly useful among the above-described MDZ3 
nucleic acids are those that are expressed, or the 
complement of which are expressed, in brain, testis, heart 
and bone marrow. 

Also particularly useful among the above-described 
MDZ3 nucleic acids are those that encode, or the complement 
of which encode, a polypeptide having at least one C2H2 
(Kruppel family) zinc finger, and especially those that 
encode 7 C2H2 zinc fingers in tandem, those that encode a 
SCAN domain, those that encode a KRAB domain, and those that 
include all of a SCAN domain, KRAB domain, and 7 zinc 
fingers. 

Also particularly useful are those that encode, or the 
complement of which encode, a polypeptide having sequence-specific 
nucleic acid binding regulatory activity, and that 
participates in protein-protein interactions with other 
transcription modulators. 

Particularly Useful Nucleic Acids of MDZ4 

Particularly useful among the above-described MDZ4 
nucleic acids are those that are expressed, or the 
complement of which are expressed, in bone marrow, brain, 
heart, HeLa cells, adult liver, fetal liver, lung, placenta and 
prostate. 

Other particularly useful embodiments of the MDZ4 
nucleic acids above-described are those that encode, or the 
complement of which encode, a polypeptide that has at least 
one C2H2 (Kruppel family) zinc finger, and especially those 
that encode 5 C2H2 zinc fingers in tandem, those that encode 
a SCAN domain, and those that include all of a SCAN domain 
and 5 zinc fingers. 

Also particularly useful among the above-described 
MDZ4 nucleic acids are those that encode, or the complement 
of which encode, a polypeptide having sequence-specific 
nucleic acid binding regulatory activity, and that 
participates in protein-protein interactions with other 
transcription modulators. 


Particularly Useful Nucleic Acids of MDZ7 

Particularly useful among the above-described MDZ7 
nucleic acids are those that are expressed, or the 
complement of which are expressed, in testes, preferably at 
a level greater than that in kidney, liver, lung, brain or 
heart, typically at a level at least two-fold that in 
kidney, liver, lung, brain or heart, often at least three-fold, 
four-fold, or even five-fold that in kidney, liver, 
lung, brain or heart. 

Also particularly useful among the above-described 
MDZ7 nucleic acids are those that encode, or the complement 
of which encode, a polypeptide having at least one C2H2 
(Kruppel family) zinc finger, especially those having a 
plurality of zinc fingers in tandem, particularly those 
having 7 zinc fingers in tandem. 

Also particularly useful among the above-described 
MDZ7 nucleic acids are those that encode, or the complement 
of which encode, a polypeptide having sequence-specific 
nucleic acid binding regulatory activity, and that functions 
in sequence-specific modulation of gene expression. 






Particularly Useful Nucleic Acids of MDZ12 

Particularly useful among the above-described 
MDZ12 nucleic acids are those that are expressed, or the 
complement of which are expressed, in brain, heart, kidney, 
placenta, skeletal muscle, testis, HeLa cells, bone marrow 
and liver. 

Also particularly useful among the above-described 
nucleic acids are those that encode, or the complement of 
which encode, a polypeptide having a C2H2 (Kruppel family) 
zinc finger, particularly those having a plurality of such 
zinc fingers in tandem, especially those having at least 5, 
often at least 12, zinc fingers in tandem. Also 
particularly useful among the above-described nucleic acids 
are those that encode, or the complement of which encode, a 
polypeptide having a KRAB-B domain, especially those having 
both a KRAB domain and at least one, preferably a plurality, 
especially at least 10, often at least 12, zinc finger 
domains. 

Particularly useful nucleic acids are those that 
encode, or the complement of which encode, polypeptides that 
act as sequence-specific transcription regulators, and that 
interact with other transcriptional modulators by 
protein-protein interactions. 


Nucleic Acid Fragments 

In another series of nucleic acid embodiments, the 
invention provides fragments of various of the isolated 
nucleic acids of the present invention which prove useful, 
inter alia, as nucleic acid probes, as amplification 
primers, and to direct expression or synthesis of epitopic 
or immunogenic protein fragments. 

Nucleic Acid Fragments of MDZ3 

In a first embodiment, the invention provides an 
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 
nucleotides of (i) SEQ ID NO: 4, (ii) a degenerate variant 
of SEQ ID NO: 6, or (iii) the complement of (i) or (ii), 
wherein the isolated nucleic acid is no more than about 100 
kb in length, typically no more than about 75 kb in length, 
more typically no more than about 50 kb in length. Often, 
the isolated nucleic acids of this embodiment are no more 
than about 25 kb in length, often no more than about 15 kb 
in length, and frequently no more than about 10 kb in 
length. 
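
As a rough illustration of the fragment criterion, the sketch below tests whether a candidate nucleic acid contains a run of at least 17 contiguous nucleotides drawn from a reference sequence or from its complement, and whether it falls under a length cap. The function names are hypothetical, the reference is whatever sequence the caller supplies (no sequence from the Sequence Listing is reproduced here), the complement is taken as the reverse complement read 5' to 3', and "about 100 kb" is simplified to a hard cutoff of 100,000 nucleotides; none of these choices is prescribed by the specification.

```python
# Complement table for unambiguous DNA bases only (an assumption of this sketch).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_fragment(candidate: str, reference: str, min_len: int = 17) -> bool:
    """True if `candidate` contains at least `min_len` contiguous
    nucleotides of `reference` or of its reverse complement."""
    candidate = candidate.upper()
    for target in (reference.upper(), reverse_complement(reference)):
        # Every min_len-long window of the target strand.
        windows = {target[i:i + min_len]
                   for i in range(len(target) - min_len + 1)}
        for i in range(len(candidate) - min_len + 1):
            if candidate[i:i + min_len] in windows:
                return True
    return False


def within_length_cap(candidate: str, cap_kb: float = 100.0) -> bool:
    """True if the candidate is no longer than `cap_kb` kilobases."""
    return len(candidate) <= cap_kb * 1000
```

Passing `min_len` values of 18, 20, 24 or 25 and `cap_kb` values of 75, 50, 25, 15 or 10 covers the other thresholds recited above.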

The invention also provides an isolated nucleic 
acid comprising (i) a nucleotide sequence that encodes a 
peptide of at least 8 contiguous amino acids of SEQ ID NO: 
7, or (ii) the complement of a nucleotide sequence that 
encodes a peptide of at least 8 contiguous amino acids of 
SEQ ID NO: 7, wherein the isolated nucleic acid is no more 
than about 100 kb in length, typically no more than about 75 
kb in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising (i) a nucleotide sequence that encodes a 
peptide of at least 15 contiguous amino acids of SEQ ID NO: 
7, or (ii) the complement of a nucleotide sequence that 
encodes a peptide of at least 15 contiguous amino acids of 
SEQ ID NO: 7, wherein the isolated nucleic acid is no more 
than about 100 kb in length, typically no more than about 75 
kb in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising a nucleotide sequence that encodes (i) a 
polypeptide having the sequence of at least 8 contiguous 
amino acids of SEQ ID NO: 7 with conservative amino acid 
substitutions, (ii) a polypeptide having the sequence of at 
least 15 contiguous amino acids of SEQ ID NO: 7 with 
conservative amino acid substitutions, (iii) a polypeptide 
having the sequence of at least 8 contiguous amino acids of 
SEQ ID NO: 7 with moderately conservative substitutions, (iv) 
a polypeptide having the sequence of at least 15 contiguous 
amino acids of SEQ ID NO: 7 with moderately conservative 
substitutions, or (v) the complement of any of (i) - (iv), 
wherein the isolated nucleic acid is no more than about 100 
kb in length, typically no more than about 75 kb in length, 
more typically no more than about 50 kb in length. Often, 
the isolated nucleic acids of this embodiment are no more 
than about 25 kb in length, often no more than about 15 kb 
in length, and frequently no more than about 10 kb in 
length. 






Nucleic Acid Fragments of MDZ4 



In a first embodiment, the invention provides an 
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 
nucleotides of (i) SEQ ID NO: 3030, (ii) a degenerate 
variant of SEQ ID NO: 3032, or (iii) the complement of (i) 
or (ii), wherein the isolated nucleic acid is no more than 
about 100 kb in length, typically no more than about 75 kb 
in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising (i) a nucleotide sequence that encodes a 
peptide of at least 8 contiguous amino acids of SEQ ID NO: 
3033, (ii) a nucleotide sequence that encodes a peptide of 
at least 15 contiguous amino acids of SEQ ID NO: 3033, or 
(iii) the complement of (i) or (ii), wherein the isolated 
nucleic acid is no more than about 100 kb in length, 
typically no more than about 75 kb in length, more typically 
no more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 
kb in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising a nucleotide sequence that (i) encodes a 
polypeptide having the sequence of at least 8 contiguous 
amino acids of SEQ ID NO: 3033 with conservative amino acid 
substitutions, (ii) encodes a polypeptide having the 
sequence of at least 8 contiguous amino acids of SEQ ID NO: 
3033 with moderately conservative amino acid substitutions, 
(iii) encodes a polypeptide having the sequence of at least 
15 contiguous amino acids of SEQ ID NO: 3033 with 
conservative amino acid substitutions, (iv) encodes a 
polypeptide having the sequence of at least 15 contiguous 
amino acids of SEQ ID NO: 3033 with moderately conservative 
amino acid substitutions, or (v) is the complement of any one 
of (i) - (iv), wherein the isolated nucleic acid is no more 
than about 100 kb in length, typically no more than about 75 
kb in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In a further embodiment, the invention provides an 
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 
nucleotides of (i) SEQ ID NO: 3034, (ii) a degenerate 
variant of SEQ ID NO: 3035, or (iii) the complement of (i) 
or (ii), wherein the isolated nucleic acid is no more than 
about 100 kb in length, typically no more than about 75 kb 
in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising (i) a nucleotide sequence that encodes a 
peptide of at least 8 contiguous amino acids of SEQ ID NO: 
3037, or (ii) the complement of a nucleotide sequence that 
encodes a peptide of at least 8 contiguous amino acids of 
SEQ ID NO: 3037, wherein the isolated nucleic acid is no 
more than about 100 kb in length, typically no more than 
about 75 kb in length, more typically no more than about 50 
kb in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising a nucleotide sequence that (i) encodes a 
polypeptide having the sequence of at least 8 contiguous 
amino acids of SEQ ID NO: 3037 having conservative amino 
acid substitutions, (ii) encodes a polypeptide having the 
sequence of at least 8 contiguous amino acids of SEQ ID NO: 
3037 having moderately conservative amino acid 
substitutions, (iii) encodes a polypeptide having the 
sequence of at least 15 contiguous amino acids of SEQ ID NO: 
3037 having conservative amino acid substitutions, (iv) 
encodes a polypeptide having the sequence of at least 15 
contiguous amino acids of SEQ ID NO: 3037 having moderately 
conservative amino acid substitutions, or (v) is the 
complement of any of (i) - (iv), wherein the isolated 
nucleic acid is no more than about 100 kb in length, 
typically no more than about 75 kb in length, more typically 
no more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 
kb in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

Nucleic Acid Fragments of MDZ7 

In a first embodiment of this aspect of the 
invention, the invention provides an isolated nucleic acid 
comprising at least 17 nucleotides, 18 nucleotides, 20 
nucleotides, 24 nucleotides, or 25 nucleotides of (i) SEQ ID 
NO: 4410, or (ii) the complement thereof, wherein the 
isolated nucleic acid is no more than about 100 kb in 
length, typically no more than about 75 kb in length, more 
typically no more than about 50 kb in length. Often, the 
isolated nucleic acids of this embodiment are no more than 
about 25 kb in length, often no more than about 15 kb in 
length, and frequently no more than about 10 kb in length. 

In a second embodiment, the invention provides an 
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 
nucleotides of (i) SEQ ID NO: 4411, or (ii) the complement 
thereof, wherein the isolated nucleic acid is no more 
than about 100 kb in length, typically no more than about 75 
kb in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

Nucleic Acid Fragments of MDZ12 

In a first embodiment, the invention provides an 
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 
nucleotides of (i) SEQ ID NO: 5773, (ii) a degenerate 
variant of SEQ ID NO: 5773, or (iii) the complement of (i) 
or (ii), wherein the isolated nucleic acid is no more than 
about 100 kb in length, typically no more than about 75 kb 
in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising (i) a nucleotide sequence that encodes a 
peptide of at least 8 contiguous amino acids of SEQ ID NO: 
5774, or (ii) the complement of a nucleotide sequence that 
encodes a peptide of at least 8 contiguous amino acids of 
SEQ ID NO: 5774, wherein the isolated nucleic acid is no 
more than about 100 kb in length, typically no more than 
about 75 kb in length, more typically no more than about 50 
kb in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising (i) a nucleotide sequence that encodes a 
peptide of at least 15 contiguous amino acids of SEQ ID NO: 
5774, or (ii) the complement of a nucleotide sequence that 
encodes a peptide of at least 15 contiguous amino acids of 
SEQ ID NO: 5774, wherein the isolated nucleic acid is no 
more than about 100 kb in length, typically no more than 
about 75 kb in length, more typically no more than about 50 
kb in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising a nucleotide sequence that (i) encodes a 
polypeptide having the sequence of at least 8 contiguous 
amino acids of SEQ ID NO: 5774 with conservative amino acid 
substitutions, (ii) encodes a polypeptide having the 
sequence of at least 15 contiguous amino acids of SEQ ID 
NO: 5774 having conservative amino acid substitutions, (iii) 
encodes a polypeptide having the sequence of at least 8 
contiguous amino acids of SEQ ID NO: 5774 with moderately 
conservative amino acid substitutions, (iv) encodes a 
polypeptide having the sequence of at least 15 contiguous 
amino acids of SEQ ID NO: 5774 having moderately conservative 
amino acid substitutions, or (v) is the complement of any one 
of (i) - (iv), wherein the isolated nucleic acid is no more 
than about 100 kb in length, typically no more than about 75 
kb in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising at least 17 nucleotides, 18 
nucleotides, 20 nucleotides, 24 nucleotides, or 25 
nucleotides of (i) SEQ ID NO: 6941, (ii) a degenerate 
variant of SEQ ID NO: 6941, or (iii) the complement of (i) 
or (ii), wherein the isolated nucleic acid is no more than 
about 100 kb in length, typically no more than about 75 kb 
in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising (i) a nucleotide sequence that encodes a 
peptide of at least 8 contiguous amino acids of SEQ ID NO: 
6942, or (ii) the complement of a nucleotide sequence that 
encodes a peptide of at least 8 contiguous amino acids of 
SEQ ID NO: 6942, wherein the isolated nucleic acid is no 
more than about 100 kb in length, typically no more than 
about 75 kb in length, more typically no more than about 50 
kb in length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

The invention also provides an isolated nucleic 
acid comprising a nucleotide sequence that (i) encodes a 
polypeptide having the sequence of at least 8 contiguous 
amino acids of SEQ ID NO: 6942 with conservative amino acid 
substitutions, (ii) encodes a polypeptide having the 
sequence of at least 15 contiguous amino acids of SEQ ID NO: 
6942 with conservative amino acid substitutions, (iii) 
encodes a polypeptide having the sequence of at least 8 
contiguous amino acids of SEQ ID NO: 6942 with moderately 
conservative amino acid substitutions, (iv) encodes a 
polypeptide having the sequence of at least 15 contiguous 
amino acids of SEQ ID NO: 6942 with moderately conservative 
amino acid substitutions, or (v) is the complement of any one 
of (i) - (iv). 


Single Exon Probes 



The invention further provides genome-derived 
single exon probes having portions of no more than one exon 
of the above-described genes. As further described in 
commonly owned and copending U.S. patent application serial 
no. 09/632,366, filed August 3, 2000 ("Methods and Apparatus 
for High Throughput Detection and Characterization of 
Alternatively Spliced Genes"), the disclosure of which is 
incorporated herein by reference in its entirety, such 
single exon probes have particular utility in identifying 
and characterizing splice variants. In particular, such 
single exon probes are useful for identifying and 
discriminating the expression of distinct isoforms of genes. 
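
The "no more than one exon" constraint can be pictured in coordinate terms. The sketch below assumes that exon boundaries are available as half-open `(start, end)` intervals on a genomic sequence and that a probe is likewise described by the genomic interval it covers; the helper names and the interval representation are illustrative assumptions and do not come from the referenced application. A probe qualifies if it overlaps exactly one exon and shares at least 17 contiguous positions with it.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open genomic interval [start, end)


def overlap(a: Interval, b: Interval) -> int:
    """Number of positions shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def exons_touched(probe: Interval, exons: Sequence[Interval]) -> List[int]:
    """Indices of the exons that the probe overlaps by at least one base."""
    return [i for i, exon in enumerate(exons) if overlap(probe, exon) > 0]


def is_single_exon_probe(probe: Interval,
                         exons: Sequence[Interval],
                         min_exonic: int = 17) -> bool:
    """True if the probe overlaps exactly one exon and the overlap is at
    least `min_exonic` contiguous nucleotides long."""
    hits = exons_touched(probe, exons)
    if len(hits) != 1:
        return False
    return overlap(probe, exons[hits[0]]) >= min_exonic
```

Because a probe confined to one exon cannot span an exon-exon junction, signal from such a probe reports on that exon alone, which is what makes it useful for telling isoforms apart.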


Single Exon Probes of MDZ3 



The invention further provides genome-derived 
single exon probes having portions of no more than one exon 
of the MDZ3 gene. 

In a first embodiment, the invention provides an 
isolated nucleic acid comprising a nucleotide sequence of no 
more than one portion of SEQ ID NOs: 8 - 15 or the 
complement of SEQ ID NOs: 8 - 15, wherein the portion 
comprises at least 17 contiguous nucleotides, 18 contiguous 
nucleotides, 20 contiguous nucleotides, 24 contiguous 
nucleotides, 25 contiguous nucleotides, or 50 contiguous 
nucleotides of any one of SEQ ID NOs: 8 - 15, or their 
complement. In a further embodiment, the exonic portion 
comprises the entirety of the referenced SEQ ID NO: or its 
complement. 

In other embodiments, the invention provides 
isolated single exon probes having the nucleotide sequence 
of any one of SEQ ID NOs: 16 - 23. 

Single Exon Probes of MDZ4 

The invention further provides genome-derived 
single exon probes having portions of no more than one exon 
of the MDZ4 gene. 

In a first embodiment, the invention provides an 
isolated nucleic acid comprising a nucleotide sequence of no 
more than one portion of SEQ ID NOs: 3038 - 3041 or the 
complement of SEQ ID NOs: 3038 - 3041, wherein the portion 
comprises at least 17 contiguous nucleotides, 18 contiguous 
nucleotides, 20 contiguous nucleotides, 24 contiguous 
nucleotides, 25 contiguous nucleotides, or 50 contiguous 
nucleotides of any one of SEQ ID NOs: 3038 - 3041, or their 
complement. In a further embodiment, the exonic portion 
comprises the entirety of the referenced SEQ ID NO: or its 
complement. 

In other embodiments, the invention provides 
isolated single exon probes having the nucleotide sequence 
of any one of SEQ ID NOs: 3042 - 3045. 

Single Exon Probes of MDZ7 

The invention further provides genome-derived 
single exon probes having portions of no more than one exon 
of the MDZ7 gene. 

In a first embodiment, the invention provides an 
isolated nucleic acid comprising a nucleotide sequence of no 
more than one portion of SEQ ID NOs: 4412 - 4415 or the 
complement of SEQ ID NOs: 4412 - 4415, wherein the portion 
comprises at least 17 contiguous nucleotides, 18 contiguous 
nucleotides, 20 contiguous nucleotides, 24 contiguous 
nucleotides, 25 contiguous nucleotides, or 50 contiguous 
nucleotides of any one of SEQ ID NOs: 4412 - 4415, or their 
complement. In a further embodiment, the exonic portion 
comprises the entirety of the referenced SEQ ID NO: or its 
complement. 

In other embodiments, the invention provides 
isolated single exon probes having the nucleotide sequence 
of any one of SEQ ID NOs: 4416 - 4419. 


Single Exon Probes of MDZ12 

The invention further provides genome-derived 
single exon probes having portions of no more than one exon 
of the MDZ12 gene. 

In a first embodiment, the invention provides an 
isolated nucleic acid comprising a nucleotide sequence of no 
more than one portion of SEQ ID NOs: 5775 - 5778, 6941 or 
the complement of SEQ ID NOs: 5775 - 5778, 6941, wherein the 
portion comprises at least 17 contiguous nucleotides, 18 
contiguous nucleotides, 20 contiguous nucleotides, 24 
contiguous nucleotides, 25 contiguous nucleotides, or 50 
contiguous nucleotides of any one of SEQ ID NOs: 5775 - 
5778, 6941, or their complement. In a further embodiment, 
the exonic portion comprises the entirety of the referenced 
SEQ ID NO: or its complement. 

In other embodiments, the invention provides 
isolated single exon probes having the nucleotide sequence 
of any one of SEQ ID NOs: 5779 - 5782 and 6941. 

Transcription Control Nucleic Acids 

Transcription Control Nucleic Acids of MDZ3 

In another aspect, the present invention provides 
genome-derived isolated nucleic acids that include nucleic 
acid sequence elements that control transcription of the 
MDZ3 gene. These nucleic acids can be used, inter alia, to 
drive expression of heterologous coding regions in 
recombinant constructs, thus conferring upon such 
heterologous coding regions the expression pattern of the 
native MDZ3 gene. These nucleic acids can also be used, 
conversely, to target heterologous transcription control 
elements to the MDZ3 genomic locus, altering the expression 
pattern of the MDZ3 gene itself. 

In a first such embodiment, the invention provides 
an isolated nucleic acid comprising the nucleotide sequence 
of SEQ ID NO: 24 or its complement, wherein the isolated 
nucleic acid is no more than about 100 kb in length, 
typically no more than about 75 kb in length, more typically 
no more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 
kb in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

In another embodiment, the invention provides an 
isolated nucleic acid comprising at least 17, 18, 20, 24, or 
25 nucleotides of the sequence of SEQ ID NO: 24 or its 
complement, wherein the isolated nucleic acid is no more 
than about 100 kb in length, typically no more than about 75 
kb in length, more typically no more than about 50 kb in 
length. Often, the isolated nucleic acids of this 
embodiment are no more than about 25 kb in length, often no 
more than about 15 kb in length, and frequently no more than 
about 10 kb in length. 

Transcription Control Nucleic Acids of MDZ4 

In another aspect, the present invention provides 
genome-derived isolated nucleic acids that include nucleic 
acid sequence elements that control transcription of the 
MDZ4 gene. These nucleic acids can be used, inter alia, to 
drive expression of heterologous coding regions in 
recombinant constructs, thus conferring upon such 
heterologous coding regions the expression pattern of the 
native MDZ4 gene. These nucleic acids can also be used, 
conversely, to target heterologous transcription control 
elements to the MDZ4 genomic locus, altering the expression 
pattern of the MDZ4 gene itself. 

In a first such embodiment, the invention provides 
an isolated nucleic acid comprising the nucleotide sequence 
of SEQ ID NO: 3046 or its complement, wherein the isolated 
nucleic acid is no more than about 100 kb in length, 
typically no more than about 75 kb in length, more typically 
no more than about 50 kb in length. Often, the isolated 
nucleic acids of this embodiment are no more than about 25 
kb in length, often no more than about 15 kb in length, and 
frequently no more than about 10 kb in length. 

In another embodiment, the invention provides an
isolated nucleic acid comprising at least 17, 18, 20, 24, or
25 nucleotides of the sequence of SEQ ID NO: 3046 or its
complement, wherein the isolated nucleic acid is no more
than about 100 kb in length, typically no more than about 75
kb in length, more typically no more than about 50 kb in
length. Often, the isolated nucleic acids of this
embodiment are no more than about 25 kb in length, often no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.
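
By way of illustration only, the fragment-based embodiments above can be checked computationally. The following minimal Python sketch assumes that the recited "at least 17, 18, 20, 24, or 25 nucleotides" are contiguous, and treats the "about 100 kb" ceiling as an exact cutoff; both are simplifying assumptions, and the function names are hypothetical rather than part of the disclosure.

```python
# Hypothetical helper names; the sequences used with it are toy examples.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shares_fragment(candidate, reference, min_len=17):
    """True if candidate contains at least min_len contiguous nucleotides
    of the reference sequence or of its reverse complement."""
    candidate = candidate.upper()
    for strand in (reference.upper(), reverse_complement(reference)):
        for start in range(len(strand) - min_len + 1):
            if strand[start:start + min_len] in candidate:
                return True
    return False


def within_length_cap(candidate, max_kb=100):
    """True if candidate is no longer than max_kb kilobases
    (the text says 'about'; an exact cutoff is used here)."""
    return len(candidate) <= max_kb * 1000
```

A candidate nucleic acid would satisfy the embodiment, under these assumptions, when both `shares_fragment` and `within_length_cap` return true for the chosen thresholds.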

Transcription Control Nucleic Acids of MDZ7

In another aspect, the present invention provides
genome-derived isolated nucleic acids that include nucleic
acid sequence elements that control transcription of the
MDZ7 gene. These nucleic acids can be used, inter alia, to
drive expression of heterologous coding regions in
recombinant constructs, thus conferring upon such
heterologous coding regions the expression pattern of the
native MDZ7 gene. These nucleic acids can also be used,
conversely, to target heterologous transcription control
elements to the MDZ7 genomic locus, altering the expression
pattern of the MDZ7 gene itself.

In a first such embodiment, the invention provides
an isolated nucleic acid comprising the nucleotide sequence
of SEQ ID NO: 4420 or its complement, wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, more typically
no more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In another embodiment, the invention provides an
isolated nucleic acid comprising at least 17, 18, 20, 24, or
25 nucleotides of the sequence of SEQ ID NO: 4420 or its
complement, wherein the isolated nucleic acid is no more
than about 100 kb in length, typically no more than about 75
kb in length, more typically no more than about 50 kb in
length. Often, the isolated nucleic acids of this
embodiment are no more than about 25 kb in length, often no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.

Transcription Control Nucleic Acids of MDZ12

In another aspect, the present invention provides
genome-derived isolated nucleic acids that include nucleic
acid sequence elements that control transcription of the
MDZ12 gene. These nucleic acids can be used, inter alia, to
drive expression of heterologous coding regions in
recombinant constructs, thus conferring upon such
heterologous coding regions the expression pattern of the
native MDZ12 gene. These nucleic acids can also be used,
conversely, to target heterologous transcription control
elements to the MDZ12 genomic locus, altering the expression
pattern of the MDZ12 gene itself.

In a first such embodiment, the invention provides
an isolated nucleic acid comprising the nucleotide sequence
of SEQ ID NO: 5783 or its complement, wherein the isolated
nucleic acid is no more than about 100 kb in length,
typically no more than about 75 kb in length, more typically
no more than about 50 kb in length. Often, the isolated
nucleic acids of this embodiment are no more than about 25
kb in length, often no more than about 15 kb in length, and
frequently no more than about 10 kb in length.

In another embodiment, the invention provides an
isolated nucleic acid comprising at least 17, 18, 20, 24, or
25 nucleotides of the sequence of SEQ ID NO: 5783 or its
complement, wherein the isolated nucleic acid is no more
than about 100 kb in length, typically no more than about 75
kb in length, more typically no more than about 50 kb in
length. Often, the isolated nucleic acids of this
embodiment are no more than about 25 kb in length, often no
more than about 15 kb in length, and frequently no more than
about 10 kb in length.

VECTORS AND HOST CELLS

In another aspect, the present invention provides
vectors that comprise one or more of the isolated nucleic
acids of the present invention, and host cells in which such
vectors have been introduced.

The vectors can be used, inter alia, for
propagating the nucleic acids of the present invention in
host cells (cloning vectors), for shuttling the nucleic
acids of the present invention between host cells derived
from disparate organisms (shuttle vectors), for inserting
the nucleic acids of the present invention into host cell
chromosomes (insertion vectors), for expressing sense or
antisense RNA transcripts of the nucleic acids of the
present invention in vitro or within a host cell, and for
expressing polypeptides encoded by the nucleic acids of the
present invention, alone or as fusions to heterologous
polypeptides. Vectors of the present invention will often
be suitable for several such uses.

Vectors are by now well-known in the art, and are
described, inter alia, in Jones et al. (eds.), Vectors:
Cloning Applications: Essential Techniques (Essential
Techniques Series), John Wiley & Son Ltd, 1998 (ISBN:
047196266X); Jones et al. (eds.), Vectors: Expression
Systems: Essential Techniques (Essential Techniques
Series), John Wiley & Son Ltd, 1998 (ISBN: 0471962678);
Gacesa et al., Vectors: Essential Data, John Wiley & Sons,
1995 (ISBN: 0471948411); Cid-Arregui (eds.), Viral Vectors:
Basic Science and Gene Therapy, Eaton Publishing Co., 2000
(ISBN: 188129935X); Sambrook et al., Molecular Cloning: A
Laboratory Manual (3rd ed.), Cold Spring Harbor Laboratory
Press, 2001 (ISBN: 0879695773); and Ausubel et al. (eds.), Short
Protocols in Molecular Biology: A Compendium of Methods from
Current Protocols in Molecular Biology (4th ed.), John Wiley
& Sons, 1999 (ISBN: 047132938X), the disclosures of which
are incorporated herein by reference in their entireties.
Furthermore, an enormous variety of vectors are available
commercially. Use of existing vectors and modifications
thereof being well within the skill in the art, only basic
features need be described here.

Typically, vectors are derived from virus,
plasmid, prokaryotic or eukaryotic chromosomal elements, or
some combination thereof, and include at least one origin of
replication, at least one site for insertion of heterologous
nucleic acid, typically in the form of a polylinker with
multiple, tightly clustered, single cutting restriction
sites, and at least one selectable marker, although some
integrative vectors will lack an origin that is functional
in the host to be chromosomally modified, and some vectors
will lack selectable markers. Vectors of the present
invention will further include at least one nucleic acid of
the present invention inserted into the vector in at least
one location.

Where present, the origin of replication and
selectable markers are chosen based upon the desired host
cell or host cells; the host cells, in turn, are selected
based upon the desired application.

For example, prokaryotic cells, typically E. coli,
are usually chosen for cloning. In such case, vector
replication is predicated on the replication strategies of
coliform-infecting phage (such as phage lambda, M13, T7, T3
and P1) or on the replication origin of autonomously
replicating episomes, notably the ColE1 plasmid and later
derivatives, including pBR322 and the pUC series plasmids.
Where E. coli is used as host, selectable markers are,
analogously, chosen for selectivity in gram negative
bacteria: e.g., typical markers confer resistance to
antibiotics, such as ampicillin, tetracycline,
chloramphenicol, kanamycin, streptomycin, and zeocin;
auxotrophic markers can also be used.

As another example, yeast cells, typically S.
cerevisiae, are chosen, inter alia, for eukaryotic genetic
studies, due to the ease of targeting genetic changes by
homologous recombination and to the ready ability to
complement genetic defects using recombinantly expressed
proteins, for identification of interacting protein
components, e.g. through use of a two-hybrid system, and for
protein expression. Vectors of the present invention for
use in yeast will typically, but not invariably, contain an
origin of replication suitable for use in yeast and a
selectable marker that is functional in yeast.

Integrative YIp vectors do not replicate
autonomously, but integrate, typically in single copy, into
the yeast genome at low frequencies and thus replicate as
part of the host cell chromosome; these vectors lack an
origin of replication that is functional in yeast, although
they typically have at least one origin of replication
suitable for propagation of the vector in bacterial cells.
YEp vectors, in contrast, replicate episomally and
autonomously due to presence of the yeast 2 micron plasmid
origin (2 µm ori). The YCp yeast centromere plasmid vectors
are autonomously replicating vectors containing centromere
sequences, CEN, and autonomously replicating sequences, ARS;
the ARS sequences are believed to correspond to the natural
replication origins of yeast chromosomes. YACs are based on
yeast linear plasmids, denoted YLp, containing homologous or
heterologous DNA sequences that function as telomeres (TEL)
in vivo, as well as containing yeast ARS (origins of
replication) and CEN (centromeres) segments.

Selectable markers in yeast vectors include a
variety of auxotrophic markers, the most common of which are
(in Saccharomyces cerevisiae) URA3, HIS3, LEU2, TRP1 and
LYS2, which complement specific auxotrophic mutations, such
as ura3-52, his3-D1, leu2-D1, trp1-D1 and lys2-201. The
URA3 and LYS2 yeast genes further permit negative selection
based on specific inhibitors, 5-fluoro-orotic acid (FOA) and
α-aminoadipic acid (αAA), respectively, that prevent growth
of the prototrophic strains but allow growth of the ura3
and lys2 mutants, respectively. Other selectable markers
confer resistance to, e.g., zeocin.

As yet another example, insect cells are often
chosen for high efficiency protein expression. Where the
host cells are from Spodoptera frugiperda (e.g., Sf9 and
Sf21 cell lines, and expresSF™ cells (Protein Sciences
Corp., Meriden, CT, USA)), the vector replicative strategy
is typically based upon the baculovirus life cycle.
Typically, baculovirus transfer vectors are used to replace
the wild-type AcMNPV polyhedrin gene with a heterologous
gene of interest. Sequences that flank the polyhedrin gene
in the wild-type genome are positioned 5' and 3' of the
expression cassette on the transfer vectors. Following
cotransfection with AcMNPV DNA, a homologous recombination
event occurs between these sequences resulting in a
recombinant virus carrying the gene of interest and the
polyhedrin or p10 promoter. Selection can be based upon
visual screening for lacZ fusion activity.

As yet another example, mammalian cells are often
chosen for expression of proteins intended as pharmaceutical
agents, and are also chosen as host cells for screening of
potential agonists and antagonists of a protein or a
physiological pathway.

Where mammalian cells are chosen as host cells,
vectors intended for autonomous extrachromosomal replication
will typically include a viral origin, such as the SV40
origin (for replication in cell lines expressing the large
T-antigen, such as COS1 and COS7 cells), the papillomavirus
origin, or the EBV origin for long term episomal replication
(for use, e.g., in 293-EBNA cells, which constitutively
express the EBV EBNA-1 gene product and adenovirus E1A).
Vectors intended for integration, and thus replication as
part of the mammalian chromosome, can, but need not, include
an origin of replication functional in mammalian cells, such
as the SV40 origin. Vectors based upon viruses, such as
adenovirus, adeno-associated virus, vaccinia virus, and
various mammalian retroviruses, will typically replicate
according to the viral replicative strategy.

Selectable markers for use in mammalian cells
include resistance to neomycin (G418), blasticidin,
hygromycin and zeocin, and selection based upon the
purine salvage pathway using HAT medium.

Plant cells can also be used for expression, with
the vector replicon typically derived from a plant virus
(e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus,
TMV) and selectable markers chosen for suitability in
plants.
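
The host-dependent choices described above can be collected, purely for illustration, into a small lookup table. The Python sketch below only restates the replicons and selectable markers named in the preceding paragraphs; the dictionary layout, the key names, and the helper `hosts_with_marker` are hypothetical conveniences and do not describe any particular vector.

```python
# Replicons and selectable markers named in the text, keyed by host type.
# The structure and names are illustrative assumptions only.
HOST_VECTOR_FEATURES = {
    "E. coli": {
        "replicons": ["phage lambda", "M13", "T7", "T3", "P1",
                      "ColE1 (pBR322, pUC)"],
        "markers": ["ampicillin", "tetracycline", "chloramphenicol",
                    "kanamycin", "streptomycin", "zeocin"],
    },
    "S. cerevisiae": {
        "replicons": ["2 micron ori (YEp)", "ARS/CEN (YCp)"],
        "markers": ["URA3", "HIS3", "LEU2", "TRP1", "LYS2", "zeocin"],
    },
    "insect": {
        "replicons": ["baculovirus (AcMNPV)"],
        "markers": [],  # the text describes lacZ screening, not a marker
    },
    "mammalian": {
        "replicons": ["SV40 origin", "papillomavirus origin", "EBV origin"],
        "markers": ["neomycin (G418)", "blasticidin", "hygromycin",
                    "zeocin", "HAT medium"],
    },
    "plant": {
        "replicons": ["CaMV", "TMV"],
        "markers": [],  # the text names no specific plant markers
    },
}


def hosts_with_marker(marker):
    """Return the sorted host types for which the text lists the marker."""
    return sorted(host for host, features in HOST_VECTOR_FEATURES.items()
                  if marker in features["markers"])
```

For instance, `hosts_with_marker("zeocin")` lists the bacterial, yeast, and mammalian hosts, reflecting that zeocin resistance is named for all three.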

For propagation of nucleic acids of the present
invention that are larger than can readily be accommodated in
vectors derived from plasmids or virus, the invention
further provides artificial chromosomes (BACs, YACs,
HACs and PACs) that comprise MDZ3, MDZ4, MDZ7 and MDZ12
nucleic acids, respectively, often genomic nucleic acids.

The BAC system is based on the well-characterized
E. coli F-factor, a low copy plasmid that exists in a
supercoiled circular form in host cells. The structural
features of the F-factor allow stable maintenance of
individual human DNA clones as well as easy manipulation of
the cloned DNA. See Shizuya et al., Keio J. Med.
50(1):26-30 (2001); Shizuya et al., Proc. Natl. Acad. Sci.
USA 89(18):8794-7 (1992).

YACs are based on yeast linear plasmids, denoted
YLp, containing homologous or heterologous DNA sequences
that function as telomeres (TEL) in vivo, as well as
containing yeast ARS (origins of replication) and CEN
(centromeres) segments.

HACs are human artificial chromosomes. Kuroiwa et
al., Nature Biotechnol. 18(10):1086-90 (2000); Henning et
al., Proc. Natl. Acad. Sci. USA 96(2):592-7 (1999);
Harrington et al., Nature Genet. 15(4):345-55 (1997). In
one version, long synthetic arrays of alpha satellite DNA
are combined with telomeric DNA and genomic DNA to generate
linear microchromosomes that are mitotically and
cytogenetically stable in the absence of selection.

PACs are P1-derived artificial chromosomes.
Sternberg, Proc. Natl. Acad. Sci. USA 87(1):103-7 (1990);
Sternberg et al., New Biol. 2(2):151-62 (1990); Pierce et
al., Proc. Natl. Acad. Sci. USA 89(6):2056-60 (1992).

Vectors of the present invention will also often
include elements that permit in vitro transcription of RNA
from the inserted heterologous nucleic acid. Such vectors
typically include a phage promoter, such as that from T7,
T3, or SP6, flanking the nucleic acid insert. Often two
different such promoters flank the inserted nucleic acid,
permitting separate in vitro production of both sense and
antisense strands.
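
The effect of flanking an insert with two different phage promoters can be sketched as follows. In this hypothetical Python illustration, the promoter placed upstream of the insert is assumed to yield the sense transcript and the promoter placed downstream, reading the opposite strand, the antisense transcript; which phage promoter sits on which side is an assumption made for the example, not a feature of any particular vector.

```python
# Illustrative only: models transcripts as strings, ignoring promoter
# and terminator sequences, transcript start sites, and run-off ends.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def sense_transcript(insert):
    """RNA with the same sequence as the insert's coding strand."""
    return insert.upper().replace("T", "U")


def antisense_transcript(insert):
    """RNA complementary to the sense transcript, written 5' to 3'."""
    return reverse_complement(insert).replace("T", "U")


def in_vitro_transcript(insert, promoter, upstream="T7", downstream="SP6"):
    """Pick the transcript produced from the named flanking promoter.
    The upstream/downstream assignment is a hypothetical layout."""
    if promoter == upstream:
        return sense_transcript(insert)
    if promoter == downstream:
        return antisense_transcript(insert)
    raise ValueError("promoter does not flank this insert")
```

Under the assumed layout, choosing the polymerase selects the strand: one polymerase yields the sense RNA and the other the antisense RNA from the same template.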

Expression vectors of the present invention, that
is, those vectors that will drive expression of polypeptides
from the inserted heterologous nucleic acid, will often
include a variety of other genetic elements operatively
linked to the protein-encoding heterologous nucleic acid
insert, typically genetic elements that drive transcription,
such as promoters and enhancer elements, those that
facilitate RNA processing, such as transcription termination
and/or polyadenylation signals, and those that facilitate
translation, such as ribosomal consensus sequences.

For example, vectors for expressing proteins of
the present invention in prokaryotic cells, typically E.
coli, will include a promoter, often a phage promoter, such
as phage lambda pL promoter, the trc promoter (a hybrid
derived from the trp and lac promoters), the bacteriophage T7
promoter (in E. coli cells engineered to express the T7
polymerase), or the araBAD operon. Often, such prokaryotic
expression vectors will further include transcription
terminators, such as the aspA terminator, and elements that
facilitate translation, such as a consensus ribosome binding
site and translation termination codon, Schomer et al.,
Proc. Natl. Acad. Sci. USA 83:8506-8510 (1986).

As another example, vectors for expressing
proteins of the present invention in yeast cells, typically
S. cerevisiae, will include a yeast promoter, such as the
CYC1 promoter, the GAL1 promoter, the ADH1 promoter, or the GPD
promoter, and will typically have elements that facilitate
transcription termination, such as the transcription
termination signals from the CYC1 or ADH1 gene.

As another example, vectors for expressing
proteins of the present invention in mammalian cells will
include a promoter active in mammalian cells. Such
promoters are often drawn from mammalian viruses, such as
the enhancer-promoter sequences from the immediate early
gene of the human cytomegalovirus (CMV), the
enhancer-promoter sequences from the Rous sarcoma virus long
terminal repeat (RSV LTR), and the enhancer-promoter from
SV40. Often, expression is enhanced by incorporation of
polyadenylation sites, such as the late SV40 polyadenylation
site and the polyadenylation signal and transcription
termination sequences from the bovine growth hormone (BGH)
gene, and ribosome binding sites. Furthermore, vectors can
include introns, such as intron II of rabbit β-globin gene
and the SV40 splice elements.

Vector-driven protein expression can be
constitutive or inducible.

Inducible vectors include naturally
inducible promoters, such as the trc promoter, which is
regulated by the lac operon, the pL promoter, which is
regulated by tryptophan, and the MMTV-LTR promoter, which is
inducible by dexamethasone, or can contain synthetic
promoters and/or additional elements that confer inducible
control on adjacent promoters. Examples of inducible
synthetic promoters are the hybrid Plac/ara-1 promoter and
the PLtetO-1 promoter. The PLtetO-1 promoter takes
advantage of the high expression levels from the PL promoter
of phage lambda, but replaces the lambda repressor sites
with two copies of operator 2 of the Tn10 tetracycline
resistance operon, causing this promoter to be tightly
repressed by the Tet repressor protein and induced in
response to tetracycline (Tc) and Tc derivatives such as
anhydrotetracycline.

As another example of inducible elements, hormone 
response elements, such as the glucocorticoid response 
element (GRE) and the estrogen response element (ERE), can
confer hormone inducibility where vectors are used for 
expression in cells having the respective hormone receptors. 

To reduce background levels of expression, elements 
responsive to ecdysone, an insect hormone, can be used 
instead, with coexpression of the ecdysone receptor. 

Expression vectors can be designed to fuse the 
expressed polypeptide to small protein tags that facilitate 
purification and/or visualization. 

For example, proteins of the present invention can
be expressed with a polyhistidine tag that facilitates
purification of the fusion protein by immobilized metal
affinity chromatography, for example using Ni-NTA resin
(Qiagen Inc., Valencia, CA, USA) or TALON™ resin (cobalt
immobilized affinity chromatography medium, Clontech Labs,
Palo Alto, CA, USA). As another example, the fusion protein
can include a chitin-binding tag and self-excising intein,
permitting chitin-based purification with self-removal of
the fused tag (IMPACT™ system, New England Biolabs, Inc.,
Beverly, MA, USA). Alternatively, the fusion protein can
include a calmodulin-binding peptide tag, permitting
purification by calmodulin affinity resin (Stratagene, La
Jolla, CA, USA), or a specifically excisable fragment of the
biotin carboxylase carrier protein, permitting purification
of in vivo biotinylated protein using an avidin resin and
subsequent tag removal (Promega, Madison, WI, USA). As
another useful alternative, the proteins of the present
invention can be expressed as a fusion to glutathione-S-transferase,
the affinity and specificity of binding to
glutathione permitting purification using glutathione
affinity resins, such as Glutathione-Superflow Resin
(Clontech Laboratories, Palo Alto, CA, USA), with subsequent
elution with free glutathione.
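
At the sequence level, adding a small purification tag amounts to joining the tag's codons in frame with the open reading frame. The Python sketch below illustrates this for a C-terminal polyhistidine tag; the choice of six histidines, the CAT codon, the TAA stop codon, and the function name are all assumptions made for the example, and commercial vectors typically supply the tag, linker, and stop codon themselves.

```python
# Illustrative in-frame fusion of a C-terminal polyhistidine tag.
# Tag length, codon choice, and stop codon are example assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_c_terminal_his_tag(orf, n_his=6, his_codon="CAT", stop="TAA"):
    """Return the ORF with its stop codon (if any) replaced by
    n_his histidine codons followed by a single stop codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if orf[-3:] in STOP_CODONS:
        orf = orf[:-3]  # remove the native stop so the tag is translated
    return orf + his_codon * n_his + stop
```

The native stop codon is removed so that translation continues through the tag; leaving it in place would yield untagged protein.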

Other tags include, for example, the Xpress
epitope, detectable by anti-Xpress antibody (Invitrogen,
Carlsbad, CA, USA), a myc tag, detectable by anti-myc tag
antibody, the V5 epitope, detectable by anti-V5 antibody
(Invitrogen, Carlsbad, CA, USA), the FLAG® epitope, detectable
by anti-FLAG® antibody (Stratagene, La Jolla, CA, USA), and
the HA epitope.

For secretion of expressed proteins, vectors can
include appropriate sequences that encode secretion signals,
such as leader peptides. For example, the pSecTag2 vectors
(Invitrogen, Carlsbad, CA, USA) are 5.2 kb mammalian
expression vectors that carry the secretion signal from the
V-J2-C region of the mouse Ig kappa-chain for efficient
secretion of recombinant proteins from a variety of
mammalian cell lines.

Expression vectors can also be designed to fuse
proteins encoded by the heterologous nucleic acid insert to
polypeptides larger than purification and/or identification
tags. Useful protein fusions include those that permit
display of the encoded protein on the surface of a phage or
cell, fusions to intrinsically fluorescent proteins, such as
those that have a green fluorescent protein (GFP)-like
chromophore, fusions to the IgG Fc region, and fusions for
use in two hybrid systems.

Vectors for phage display fuse the encoded
polypeptide to, e.g., the gene III protein (pIII) or gene
VIII protein (pVIII) for display on the surface of
filamentous phage, such as M13. See Barbas et al., Phage
Display: A Laboratory Manual, Cold Spring Harbor Laboratory
Press (2001) (ISBN 0-87969-546-3); Kay et al. (eds.), Phage
Display of Peptides and Proteins: A Laboratory Manual, San
Diego: Academic Press, Inc., 1996; Abelson et al. (eds.),
Combinatorial Chemistry, Methods in Enzymology vol. 267,
Academic Press (May 1996).

Vectors for yeast display, e.g. the pYD1 yeast
display vector (Invitrogen, Carlsbad, CA, USA), use the
α-agglutinin yeast adhesion receptor to display recombinant
protein on the surface of S. cerevisiae. Vectors for
mammalian display, e.g., the pDisplay™ vector (Invitrogen,
Carlsbad, CA, USA), target recombinant proteins using an
N-terminal cell surface targeting signal and a C-terminal
transmembrane anchoring domain of platelet derived growth
factor receptor.

A wide variety of vectors now exist that fuse
proteins encoded by heterologous nucleic acids to the
chromophore of the substrate-independent, intrinsically
fluorescent green fluorescent protein from Aequorea victoria
("GFP") and its variants. These proteins are intrinsically
fluorescent: the GFP-like chromophore is entirely encoded by
its amino acid sequence and can fluoresce without
requirement for cofactor or substrate.

Structurally, the GFP-like chromophore comprises
an 11-stranded β-barrel (β-can) with a central α-helix, the
central α-helix having a conjugated π-resonance system that
includes two aromatic ring systems and the bridge between
them. The π-resonance system is created by autocatalytic
cyclization among amino acids; cyclization proceeds through
an imidazolinone intermediate, with subsequent
dehydrogenation by molecular oxygen at the Cα-Cβ bond of a
participating tyrosine.

The GFP-like chromophore can be selected from GFP-like
chromophores found in naturally occurring proteins,
such as A. victoria GFP (GenBank accession number AAA27721),
Renilla reniformis GFP, FP583 (GenBank accession no.
AF168419) (DsRed), FP593 (AF272711), FP483 (AF168420), FP484
(AF168424), FP595 (AF246709), FP486 (AF168421), FP538
(AF168423), and FP506 (AF168422), and need include only so
much of the native protein as is needed to retain the
chromophore's intrinsic fluorescence. Methods for
determining the minimal domain required for fluorescence are
known in the art. Li et al., "Deletions of the Aequorea
victoria Green Fluorescent Protein Define the Minimal Domain
Required for Fluorescence," J. Biol. Chem. 272:28545-28549
(1997).

Alternatively, the GFP-like chromophore can be
selected from GFP-like chromophores modified from those
found in nature. Typically, such modifications are made to
improve recombinant production in heterologous expression
systems (with or without change in protein sequence), to
alter the excitation and/or emission spectra of the native
protein, to facilitate purification, to facilitate or as a
consequence of cloning, or are a fortuitous consequence of
research investigation.

The methods for engineering such modified GFP-like
chromophores and testing them for fluorescence activity,
both alone and as part of protein fusions, are well-known in
the art. Early results of these efforts are reviewed in
Heim et al., Curr. Biol. 6:178-182 (1996), incorporated
herein by reference in its entirety; a more recent review,
with tabulation of useful mutations, is found in Palm et
al., "Spectral Variants of Green Fluorescent Protein," in
Green Fluorescent Proteins, Conn (ed.), Methods Enzymol.
vol. 302, pp. 378-394 (1999), incorporated herein by
reference in its entirety. A variety of such modified
chromophores are now commercially available and can readily
be used in the fusion proteins of the present invention.

For example, EGFP ("enhanced GFP"), Cormack et
al., Gene 173:33-38 (1996); U.S. Pat. Nos. 6,090,919 and
5,804,387, is a red-shifted, human codon-optimized variant
of GFP that has been engineered for brighter fluorescence,
higher expression in mammalian cells, and an excitation
spectrum optimized for use in flow cytometers. EGFP can
usefully contribute a GFP-like chromophore to the fusion
proteins of the present invention. A variety of EGFP
vectors, both plasmid and viral, are available commercially
(Clontech Labs, Palo Alto, CA, USA), including vectors for
bacterial expression, vectors for expression of N-terminal
protein fusions, vectors for expression of C-terminal protein
fusions, and vectors for bicistronic expression.

Toward the other end of the emission spectrum,
EBFP ("enhanced blue fluorescent protein") and BFP2 contain
four amino acid substitutions that shift the emission from
green to blue, enhance the brightness of fluorescence and
improve solubility of the protein, Heim et al., Curr. Biol.
6:178-182 (1996); Cormack et al., Gene 173:33-38 (1996).
EBFP is optimized for expression in mammalian cells whereas
BFP2, which retains the original jellyfish codons, can be
expressed in bacteria; as is further discussed below, the
host cell of production does not affect the utility of the
resulting fusion protein. The GFP-like chromophores from
EBFP and BFP2 can usefully be included in the fusion
proteins of the present invention, and vectors containing
these blue-shifted variants are available from Clontech Labs
(Palo Alto, CA, USA).

Analogously, EYFP ("enhanced yellow fluorescent
protein"), also available from Clontech Labs, contains four
amino acid substitutions, different from EBFP, Ormo et al.,
Science 273:1392-1395 (1996), that shift the emission from
green to yellowish-green. Citrine, an improved yellow
fluorescent protein mutant, is described in Heikal et al.,
Proc. Natl. Acad. Sci. USA 97:11996-12001 (2000). ECFP
("enhanced cyan fluorescent protein") (Clontech Labs, Palo
Alto, CA, USA) contains six amino acid substitutions, one of
which shifts the emission spectrum from green to cyan. Heim
et al., Curr. Biol. 6:178-182 (1996); Miyawaki et al.,
Nature 388:882-887 (1997). The GFP-like chromophore of each
of these GFP variants can usefully be included in the fusion
proteins of the present invention.

The GFP-like chromophore can also be drawn from
other modified GFPs, including those described in U.S. Pat.
Nos. 6,124,128; 6,096,865; 6,090,919; 6,066,476; 6,054,321;
6,027,881; 5,968,750; 5,874,304; 5,804,387; 5,777,079;
5,741,668; and 5,625,048, the disclosures of which are
incorporated herein by reference in their entireties. See
also Conn (ed.), Green Fluorescent Protein, Methods in
Enzymol. Vol. 302, pp. 378-394 (1999), incorporated herein by
reference in its entirety. A variety of such modified
chromophores are now commercially available and can readily
be used in the fusion proteins of the present invention.

Fusions to the IgG Fc region increase serum
half-life of protein pharmaceutical products through interaction
with the FcRn receptor (also denominated the FcRp receptor
and the Brambell receptor, FcRb), further described in
international patent application nos. WO 97/43316, WO
97/34631, WO 96/32478, WO 96/18412.

For long-term, high-yield recombinant production
of the proteins, protein fusions, and protein fragments of
the present invention, stable expression is particularly
useful.

Stable expression is readily achieved by
integration into the host cell genome of vectors having
selectable markers, followed by selection for integrants.

For example, the pUB6/V5-His A, B, and C vectors
(Invitrogen, Carlsbad, CA, USA) are designed for high-level
stable expression of heterologous proteins in a wide range
of mammalian tissue types and cell lines. pUB6/V5-His uses
the promoter/enhancer sequence from the human ubiquitin C
gene to drive expression of recombinant proteins: expression
levels in 293, CHO, and NIH3T3 cells are comparable to
levels from the CMV and human EF-1α promoters. The bsd gene
permits rapid selection of stably transfected mammalian
cells with the potent antibiotic blasticidin.

Replication incompetent retroviral vectors,
typically derived from Moloney murine leukemia virus, prove
particularly useful for creating stable transfectants having
integrated provirus. The highly efficient transduction
machinery of retroviruses, coupled with the availability of
a variety of packaging cell lines, such as RetroPack™ PT67,
EcoPack2™-293, AmphoPack-293, and GP2-293 cell lines (all
available from Clontech Laboratories, Palo Alto, CA, USA),
allows a wide host range to be infected with high efficiency;
varying the multiplicity of infection readily adjusts the
copy number of the integrated provirus. Retroviral vectors
are available with a variety of selectable markers, such as
resistance to neomycin, hygromycin, and puromycin,
permitting ready selection of stable integrants.
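
The relationship between multiplicity of infection and proviral copy number can be illustrated with a simple statistical model. The Python sketch below assumes, as a common idealization that is not stated in the present disclosure, that the number of infecting particles per cell follows a Poisson distribution whose mean is the multiplicity of infection, and that every infecting particle yields one integrated provirus. The function names are hypothetical.

```python
import math


def copy_number_distribution(moi, max_copies=5):
    """Poisson probabilities that a cell carries 0..max_copies proviral
    copies, assuming the mean copy number equals the MOI."""
    return [math.exp(-moi) * moi ** k / math.factorial(k)
            for k in range(max_copies + 1)]


def fraction_transduced(moi):
    """Fraction of cells expected to carry at least one provirus."""
    return 1.0 - math.exp(-moi)


def mean_copies_in_transduced_cells(moi):
    """Expected copy number among cells that carry at least one copy."""
    if moi <= 0:
        raise ValueError("MOI must be positive")
    return moi / fraction_transduced(moi)
```

Under this model, a low multiplicity of infection gives mostly single-copy integrants among the transduced cells, while a higher multiplicity raises both the transduced fraction and the average copy number per transduced cell.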

The present invention further includes host cells
comprising the vectors of the present invention, either
present episomally within the cell or integrated, in whole
or in part, into the host cell chromosome.

Among other considerations, some of which are
described above, a host cell strain may be chosen for its
ability to process the expressed protein in the desired
fashion. Such post-translational modifications of the
polypeptide include, but are not limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation,
and acylation, and it is an aspect of the present invention
to provide MDZ3, MDZ4, MDZ7 and MDZ12 proteins,
respectively, with such post-translational modifications.

As noted earlier, host cells can be prokaryotic or
eukaryotic. Representative examples of appropriate host
cells include, but are not limited to, bacterial cells, such
as E. coli, Caulobacter crescentus, Streptomyces species,
and Salmonella typhimurium; yeast cells, such as
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pichia
pastoris, Pichia methanolica; insect cell lines, such as
those from Spodoptera frugiperda (e.g., Sf9 and Sf21 cell
lines, and expresSF™ cells (Protein Sciences Corp.,
Meriden, CT, USA)), Drosophila S2 cells, and Trichoplusia ni
High Five® Cells (Invitrogen, Carlsbad, CA, USA); and
mammalian cells. Typical mammalian cells include COS1 and
COS7 cells, Chinese hamster ovary (CHO) cells, NIH 3T3
cells, 293 cells, HEPG2 cells, HeLa cells, L cells, murine
ES cell lines (e.g., from strains 129/SV, C57/BL6, DBA-1,
129/SVJ), K562, Jurkat cells, and BW5147. Other mammalian
cell lines are well known and readily available from the
American Type Culture Collection (ATCC) (Manassas, VA, USA)
and the National Institute of General Medical Sciences
(NIGMS) Human Genetic Cell Repository at the Coriell Cell
Repositories (Camden, NJ, USA).

Methods for introducing the vectors and nucleic 
acids of the present invention into the host cells are well 
known in the art; the choice of technique will depend 
primarily upon the specific vector to be introduced and the
host cell chosen. 

For example, phage lambda vectors will typically 
be packaged using a packaging extract (e.g., Gigapack® 
packaging extract, Stratagene, La Jolla, CA, USA), and the
packaged virus used to infect E. coli. Plasmid vectors will 
typically be introduced into chemically competent or 
electrocompetent bacterial cells. 

E. coli cells can be rendered chemically competent
by treatment, e.g., with CaCl₂, or a solution of Mg²⁺, Mn²⁺,
Ca²⁺, Rb⁺ or K⁺, dimethyl sulfoxide, dithiothreitol, and
hexamine cobalt (III) (Hanahan, J. Mol. Biol. 166(4):557-80
(1983)), and vectors introduced by heat shock. A wide
variety of chemically competent strains are also available
commercially (e.g., Epicurian Coli® XL10-Gold®
Ultracompetent Cells (Stratagene, La Jolla, CA, USA); DH5α
competent cells (Clontech Laboratories, Palo Alto, CA, USA);
TOP10 Chemically Competent E. coli Kit (Invitrogen,
Carlsbad, CA, USA)).

Bacterial cells can be rendered electrocompetent
(that is, competent to take up exogenous DNA by
electroporation) by various pre-pulse treatments; vectors
are introduced by electroporation followed by outgrowth in
selected media. An extensive series of protocols is
provided online in Electroprotocols (Bio-Rad, Richmond, CA,
USA)
(http://www.bio-rad.com/LifeScience/pdf/New_Gene_Pulser.pdf).

Vectors can be introduced into yeast cells by 
spheroplasting, treatment with lithium salts,
electroporation, or protoplast fusion.

Spheroplasts are prepared by the action of
hydrolytic enzymes (a snail-gut extract, usually denoted
Glusulase, or Zymolyase, an enzyme from Arthrobacter
luteus) to remove portions of the cell wall in the presence
of osmotic stabilizers, typically 1 M sorbitol. DNA is
added to the spheroplasts, and the mixture is
co-precipitated with a solution of polyethylene glycol (PEG)
and Ca²⁺. Subsequently, the cells are resuspended in a
solution of sorbitol, mixed with molten agar and then
layered on the surface of a selective plate containing
sorbitol. For lithium-mediated transformation, yeast cells
are treated with lithium acetate, which apparently
permeabilizes the cell wall; DNA is added, and the cells are
co-precipitated with PEG. The cells are exposed to a brief
heat shock, washed free of PEG and lithium acetate, and
subsequently spread on plates containing ordinary selective
medium. Increased frequencies of transformation are
obtained by using specially-prepared single-stranded carrier
DNA and certain organic solvents. Schiestl et al., Curr.
Genet. 16(5-6):339-46 (1989). For electroporation,
freshly-grown yeast cultures are typically washed, suspended
in an osmotic protectant, such as sorbitol, and mixed with
DNA, and the cell suspension is pulsed in an electroporation
device. Subsequently, the cells are spread on the surface
of plates containing selective media. Becker et al.,
Methods Enzymol. 194:182-7 (1991). The efficiency of
transformation by electroporation can be increased over
100-fold by using PEG, single-stranded carrier DNA and cells
that are in late log-phase of growth. Larger constructs,
such as YACs, can be introduced by protoplast fusion.

Mammalian and insect cells can be directly 
infected by packaged viral vectors, or transfected by
chemical or electrical means.

For chemical transfection, DNA can be
coprecipitated with CaPO₄ or introduced using liposomal and
nonliposomal lipid-based agents. Commercial kits are
available for CaPO₄ transfection (CalPhos™ Mammalian
Transfection Kit, Clontech Laboratories, Palo Alto, CA,
USA), and lipid-mediated transfection can be practiced using
commercial reagents, such as LIPOFECTAMINE™ 2000,
LIPOFECTAMINE™ Reagent, CELLFECTIN® Reagent, and
LIPOFECTIN® Reagent (Invitrogen, Carlsbad, CA, USA); DOTAP
Liposomal Transfection Reagent, FuGENE 6, X-tremeGENE Q2,
and DOSPER (Roche Molecular Biochemicals, Indianapolis, IN,
USA); and Effectene™, PolyFect®, and Superfect® (Qiagen,
Inc., Valencia, CA, USA). Protocols for electroporating
mammalian cells can be found online in Electroprotocols
(Bio-Rad, Richmond, CA, USA)
(http://www.bio-rad.com/LifeScience/pdf/New_Gene_Pulser.pdf).
See also, Norton et al. (eds.), Gene Transfer Methods:
Introducing DNA into Living Cells and Organisms,
BioTechniques Books, Eaton Publishing Co. (2000) (ISBN
1-881299-34-1), incorporated herein by reference in its
entirety.

Other transfection techniques include transfection
by particle bombardment. See, e.g., Cheng et al., Proc.
Natl. Acad. Sci. USA 90(10):4455-9 (1993); Yang et al.,
Proc. Natl. Acad. Sci. USA 87(24):9568-72 (1990).

PROTEINS 

In another aspect, the present invention provides
MDZ3, MDZ4, MDZ7 and MDZ12 proteins, respectively, various
fragments thereof suitable for use as antigens (e.g., for
epitope mapping) and for use as immunogens (e.g., for
raising antibodies or as vaccines), fusions of MDZ3, MDZ4,
MDZ7 and MDZ12 polypeptides and fragments to heterologous
polypeptides, and conjugates of the proteins, fragments, and
fusions of the present invention to other moieties (e.g., to
carrier proteins, to fluorophores).

FIGS. 3, 6, 9, 12 and 13 present the predicted 
amino acid sequences encoded by the MDZ3, MDZ4, MDZ7, MDZ12a 
and MDZ12b (S and L) cDNA clones, respectively. The amino
acid sequences are further presented, respectively, in SEQ 
ID Nos: 3, 3029, 4409, 5772, 6939 and 6940. 

Unless otherwise indicated, amino acid sequences
of the proteins of the present invention were determined as
a predicted translation from a nucleic acid sequence.
Accordingly, any amino acid sequence presented herein may
contain errors due to errors in the nucleic acid sequence,
as described in detail above. Furthermore, single
nucleotide polymorphisms (SNPs) occur frequently in
eukaryotic genomes (more than 1.4 million SNPs have already
been identified in the human genome; International Human
Genome Sequencing Consortium, Nature 409:860-921 (2001)),
and the sequence determined from one individual of a species
may differ from other allelic forms present within the
population. Small deletions and insertions can often be
found that do not alter the function of the protein.

Accordingly, it is an aspect of the present
invention to provide proteins not only identical in sequence
to those described with particularity herein, but also to
provide isolated proteins at least about 65% identical in
sequence to those described with particularity herein,
typically at least about 70%, 75%, 80%, 85%, or 90%
identical in sequence to those described with particularity
herein, usefully at least about 91%, 92%, 93%, 94%, or 95%
identical in sequence to those described with particularity
herein, usefully at least about 96%, 97%, 98%, or 99%
identical in sequence to those described with particularity
herein, and, most conservatively, at least about 99.5%,
99.6%, 99.7%, 99.8% and 99.9% identical in sequence to those
described with particularity herein. These sequence
variants can be naturally occurring or can result from human
intervention by way of random or directed mutagenesis.

For purposes herein, percent identity of two amino
acid sequences is determined using the procedure of Tatusova
et al., "Blast 2 sequences - a new tool for comparing
protein and nucleotide sequences", FEMS Microbiol Lett.
174:247-250 (1999), which procedure is effectuated by the
computer program BLAST 2 SEQUENCES, available online at

http://www.ncbi.nlm.nih.gov/blast/bl2seq/bl2.html.

To assess percent identity of amino acid sequences, the
BLASTP module of BLAST 2 SEQUENCES is used with default
values of (i) BLOSUM62 matrix, Henikoff et al., Proc. Natl.
Acad. Sci. USA 89(22):10915-9 (1992); (ii) open gap 11 and
extension gap 1 penalties; and (iii) gap x_dropoff 50, expect
10, word size 3, and filter, and both sequences are entered in
their entireties.
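
By way of illustration only, the final arithmetic step of a percent identity determination can be sketched as follows. The sketch assumes that the two sequences have already been aligned (for example, by the BLASTP module described above) and are supplied as equal-length strings in which "-" marks a gap; it does not perform the alignment itself and is not the BLAST 2 SEQUENCES program. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Both arguments are pre-aligned sequences of equal length.
    Identical, non-gap residues count as matches; every column in
    which at least one sequence has a residue counts toward the
    alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no residue in either sequence
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no residues")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Hypothetical aligned peptides: 5 matches over 6 columns.
    print(round(percent_identity("MKT-LV", "MKTALV"), 1))
```

A value so computed can then be compared against the identity thresholds recited above (e.g., at least about 65%, 90%, or 99%).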

As is well known, amino acid substitutions occur 
frequently among natural allelic variants, with conservative 
substitutions often occasioning only de minimis change in 
protein function. 

Accordingly, it is an aspect of the present
invention to provide proteins not only identical in sequence
to those described with particularity herein, but also to
provide isolated proteins having the sequence of MDZ3, MDZ4,
MDZ7 and MDZ12 proteins, respectively, or portions thereof,
with conservative amino acid substitutions. It is a further
aspect to provide isolated proteins having the sequence of
MDZ3, MDZ4, MDZ7 and MDZ12 proteins, respectively, and
portions thereof, with moderately conservative amino acid
substitutions. These conservatively-substituted and
moderately conservatively-substituted variants can be
naturally occurring or can result from human intervention.

Although there are a variety of metrics for
calling conservative amino acid substitutions, based
primarily on either observed changes among evolutionarily
related proteins or on predicted chemical similarity, for
purposes herein a conservative replacement is any change
having a positive value in the PAM250 log-likelihood matrix
reproduced herein below (see Gonnet et al., Science
256(5062):1443-5 (1992)):




For purposes herein, a "moderately conservative" replacement 
is any change having a nonnegative value in the PAM250 log- 
likelihood matrix reproduced herein above. 
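
By way of illustration only, the two definitions above can be expressed as simple threshold tests on a matrix score. In the sketch below, the dictionary `PLACEHOLDER_SCORES` holds hypothetical placeholder values chosen solely to exercise each branch; they are not values from the PAM250 log-likelihood matrix, and the function names are likewise assumptions introduced for illustration.

```python
def is_conservative(score):
    """A conservative replacement has a positive matrix value."""
    return score > 0


def is_moderately_conservative(score):
    """A moderately conservative replacement has a nonnegative value."""
    return score >= 0


def classify(matrix, original, replacement):
    """Classify a substitution using a symmetric score lookup.

    `matrix` maps frozenset({residue_a, residue_b}) to a score, so
    the order of the two residues does not matter.
    """
    score = matrix[frozenset((original, replacement))]
    if is_conservative(score):
        return "conservative"
    if is_moderately_conservative(score):
        return "moderately conservative"
    return "non-conservative"


# Hypothetical placeholder scores, NOT taken from the PAM250 matrix.
PLACEHOLDER_SCORES = {
    frozenset(("I", "L")): 1.0,
    frozenset(("A", "G")): 0.0,
    frozenset(("G", "W")): -1.0,
}

if __name__ == "__main__":
    for pair in (("I", "L"), ("A", "G"), ("G", "W")):
        print(pair, classify(PLACEHOLDER_SCORES, *pair))
```

Note that, under these definitions, every conservative replacement also satisfies the test for a moderately conservative replacement; the classifier above reports the narrower category first.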

As is also well known in the art, relatedness of 
proteins can also be characterized using a functional test, 
the ability of the encoding nucleic acids to base-pair to 
one another at defined hybridization stringencies. 

It is, therefore, another aspect of the invention 
to provide isolated proteins not only identical in sequence 
to those described with particularity herein, but also to 
provide isolated proteins ("hybridization related proteins") 
that are encoded by nucleic acids that hybridize under high 
stringency conditions (as defined herein above) to all or to 
a portion of various of the isolated nucleic acids of the 
present invention ("reference nucleic acids"). It is a 
further aspect of the invention to provide isolated proteins 
("hybridization related proteins") that are encoded by 
nucleic acids that hybridize under moderate stringency 
conditions (as defined herein above) to all or to a portion 
of various of the isolated nucleic acids of the present 
invention ("reference nucleic acids"). 

The hybridization related proteins can be 
alternative isoforms, homologues, paralogues, and 
orthologues of the MDZ3, MDZ4, MDZ7 and MDZ12 proteins, 
respectively, of the present invention. Particularly useful 
orthologues are those from other primate species, such as 
chimpanzee, rhesus macaque monkey, baboon, orangutan, and 
gorilla; from rodents, such as rats, mice, and guinea pigs;
from lagomorphs, such as rabbits; and from domestic livestock,
such as cow, pig, sheep, horse, and goat. 

Relatedness of proteins can also be characterized 
using a second functional test, the ability of a first 
protein competitively to inhibit the binding of a second
protein to an antibody. 

It is, therefore, another aspect of the present 
invention to provide isolated proteins not only identical in 
sequence to those described with particularity herein, but
also to provide isolated proteins ("cross-reactive
proteins") that competitively inhibit the binding of
antibodies to all or to a portion of various of the isolated
MDZ3, MDZ4, MDZ7 and MDZ12 proteins, respectively, of the
present invention ("reference proteins"). Such competitive
inhibition can readily be determined using immunoassays well
known in the art.

Among the proteins of the present invention that
differ in amino acid sequence from those described with
particularity herein (including those that have deletions
and insertions causing up to 10% non-identity, those having
conservative or moderately conservative substitutions,
hybridization related proteins, and cross-reactive
proteins), those that substantially retain one or more MDZ3,
MDZ4, MDZ7 or MDZ12 activities are particularly useful. As
described above, those activities include transcription
regulation and protein-protein interaction.

Residues that are tolerant of change while
retaining function can be identified by altering the protein
at known residues using methods known in the art, such as
alanine scanning mutagenesis, Cunningham et al., Science
244(4908):1081-5 (1989); transposon linker scanning
mutagenesis, Chen et al., Gene 263(1-2):39-48 (2001);
combinations of homolog- and alanine-scanning mutagenesis,
Jin et al., J. Mol. Biol. 226(3):851-65 (1992);
combinatorial alanine scanning, Weiss et al., Proc. Natl.
Acad. Sci. USA 97(16):8950-4 (2000), followed by functional
assay. Transposon linker scanning kits are available
commercially (New England Biolabs, Beverly, MA, USA,
catalog no. E7-102S; EZ::TN™ In-Frame Linker Insertion
Kit, catalogue no. EZI04KN, Epicentre Technologies
Corporation, Madison, WI, USA).

As further described below, the isolated proteins
of the present invention can readily be used as specific
immunogens to raise antibodies that specifically recognize
MDZ3, MDZ4, MDZ7 or MDZ12 (MDZ12a, MDZ12bS, MDZ12bL)
proteins, their isoforms, homologues, paralogues, and/or
orthologues. The antibodies, in turn, can be used, inter
alia, specifically to assay for the MDZ3, MDZ4, MDZ7 or
MDZ12 proteins of the present invention (e.g., by ELISA, for
detection of protein in fluid samples, such as serum; by
immunohistochemistry or laser scanning cytometry, for
detection of protein in tissue samples; or by flow
cytometry, for detection of intracellular protein in cell
suspensions); for specific antibody-mediated isolation
and/or purification of MDZ3, MDZ4, MDZ7 or MDZ12 proteins,
as for example by immunoprecipitation; and for use as
specific agonists or antagonists of MDZ3, MDZ4, MDZ7 or
MDZ12 action.

The isolated proteins of the present invention are 
also immediately available for use as specific standards in 
assays used to determine the concentration and/or amount
specifically of the MDZ3, MDZ4, MDZ7 or MDZ12 proteins of
the present invention. As is well known, ELISA kits for
detection and quantitation of protein analytes typically
include isolated and purified protein of known concentration
for use as a measurement standard (e.g., the human
interferon-γ OptEIA kit, catalog no. 555142, Pharmingen, San
Diego, CA, USA, includes human recombinant gamma interferon,
baculovirus produced).
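
By way of illustration only, the use of a protein of known concentration as a measurement standard can be sketched as interpolation on a standard curve. The piecewise-linear interpolation, the function name `interpolate_concentration`, and the numerical readings below are all assumptions introduced for illustration; they are not measured data, and commercial assays commonly fit a nonlinear curve instead.

```python
def interpolate_concentration(standards, signal):
    """Estimate analyte concentration from a standard curve.

    `standards` is a list of (concentration, signal) pairs measured
    with the purified protein standard. The estimate is a linear
    interpolation between the two standards that bracket `signal`.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0  # flat segment: report the lower standard
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal lies outside the range of the standards")


if __name__ == "__main__":
    # Hypothetical readings: (concentration, absorbance).
    curve = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45)]
    print(interpolate_concentration(curve, 0.35))
```

Signals falling outside the range spanned by the standards are rejected rather than extrapolated, reflecting the usual practice of diluting or concentrating the sample into the calibrated range.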

The isolated proteins of the present invention are
also immediately available for use as specific biomolecule
capture probes for surface-enhanced laser desorption
ionization (SELDI) detection of protein-protein
interactions, WO 98/59362; WO 98/59360; WO 98/59361; and
Merchant et al., Electrophoresis 21(6):1164-77 (2000), the
disclosures of which are incorporated herein by reference in
their entireties. Analogously, the isolated proteins of the
present invention are also immediately available for use as
specific biomolecule capture probes on BIACORE surface
plasmon resonance probes. See Weinberger et al.,
Pharmacogenomics 1(4):395-416 (2000); Malmqvist, Biochem.

The isolated proteins of the present invention are
also useful as a therapeutic supplement in patients having a
specific deficiency in MDZ3, MDZ4, MDZ7 or MDZ12 production, 
respectively. 

In another aspect, the invention also provides 
fragments of various of the proteins of the present
invention. The protein fragments are useful, inter alia, as
antigenic and immunogenic fragments of MDZ3, MDZ4, MDZ7 or 
MDZ12, respectively. 

By "fragments" of a protein is here intended
isolated proteins (equally, polypeptides, peptides,
oligopeptides), however obtained, that have an amino acid
sequence identical to a portion of the reference amino acid
sequence, which portion is at least 6 amino acids and less
than the entirety of the reference amino acid sequence. As
so defined, "fragments" need not be obtained by physical
fragmentation of the reference protein, although such
provenance is not thereby precluded.

Fragments of at least 6 contiguous amino acids are
useful in mapping B cell and T cell epitopes of the
reference protein. See, e.g., Geysen et al., "Use of
peptide synthesis to probe viral antigens for epitopes to a
resolution of a single amino acid," Proc. Natl. Acad. Sci.
USA 81:3998-4002 (1984) and U.S. Pat. Nos. 4,708,871 and
5,595,915, the disclosures of which are incorporated herein
by reference in their entireties. Because the fragment need
not itself be immunogenic, part of an immunodominant
epitope, nor even recognized by native antibody, to be
useful in such epitope mapping, all fragments of at least 6
amino acids of the proteins of the present invention have
utility in such a study.
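
By way of illustration only, the set of contiguous fragments of a given length can be enumerated with a sliding window, as might be done when designing an overlapping peptide series for epitope mapping. The function name `contiguous_fragments` and the example sequence are hypothetical; the example sequence is not drawn from the Sequence Listing.

```python
def contiguous_fragments(sequence, length=6):
    """List every contiguous fragment of `length` residues.

    Consistent with the definition of "fragments" herein, the
    length must be at least 6 residues and less than the entirety
    of the reference sequence.
    """
    if length < 6:
        raise ValueError("fragments are at least 6 amino acids long")
    if length >= len(sequence):
        raise ValueError("fragment must be shorter than the reference")
    return [
        sequence[start:start + length]
        for start in range(len(sequence) - length + 1)
    ]


if __name__ == "__main__":
    # Hypothetical 9-residue reference sequence.
    for fragment in contiguous_fragments("ACDEFGHIK", 6):
        print(fragment)
```

A reference sequence of N residues yields N - L + 1 fragments of length L, each offset from the previous by a single residue.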

Fragments of at least 8 contiguous amino acids,
often at least 15 contiguous amino acids, have utility as
immunogens for raising antibodies that recognize the
proteins of the present invention. See, e.g., Lerner,
"Tapping the immunological repertoire to produce antibodies
of predetermined specificity," Nature 299:592-596 (1982);
Shinnick et al., "Synthetic peptide immunogens as vaccines,"
Annu. Rev. Microbiol. 37:425-46 (1983); Sutcliffe et al.,
"Antibodies that react with predetermined sites on
proteins," Science 219:660-6 (1983), the disclosures of
which are incorporated herein by reference in their
entireties. As further described in the above-cited
references, virtually all 8-mers, conjugated to a carrier,
such as a protein, prove immunogenic (that is, prove
capable of eliciting antibody for the conjugated peptide);
accordingly, all fragments of at least 8 amino acids of the
proteins of the present invention have utility as
immunogens.

Fragments of at least 8, 9, 10 or 12 contiguous
amino acids are also useful as competitive inhibitors of
binding of the entire protein, or a portion thereof, to
antibodies (as in epitope mapping), and to natural binding
partners, such as subunits in a multimeric complex or to
receptors or ligands of the subject protein; this
competitive inhibition permits identification and separation
of molecules that bind specifically to the protein of
interest, U.S. Pat. Nos. 5,539,084 and 5,783,674,
incorporated herein by reference in their entireties.

The protein, or protein fragment, of the present
invention is thus at least 6 amino acids in length,
typically at least 8, 9, 10 or 12 amino acids in length, and
often at least 15 amino acids in length. Often, the protein
of the present invention, or fragment thereof, is at least
20 amino acids in length, even 25 amino acids, 30 amino
acids, 35 amino acids, or 50 amino acids or more in length.
Of course, larger fragments having at least 75 amino acids,
100 amino acids, or even 150 amino acids are also useful,
and at times preferred.

The present invention further provides fusions of 
each of the proteins and protein fragments of the present 
invention to heterologous polypeptides. 

By "fusion" is here intended that the protein or
protein fragment of the present invention is linearly
contiguous to the heterologous polypeptide in a
peptide-bonded polymer of amino acids or amino acid
analogues; by "heterologous polypeptide" is here intended a
polypeptide that does not naturally occur in contiguity with
the protein or protein fragment of the present invention. As
so defined, the fusion can consist entirely of a plurality of
fragments of any one of the MDZ3, MDZ4, MDZ7 or MDZ12
proteins, respectively, in altered arrangement; in such
case, any of the MDZ3, MDZ4, MDZ7 or MDZ12 fragments can be
considered heterologous to the other MDZ3, MDZ4, MDZ7 or
MDZ12 fragments in the fusion protein. More typically,
however, the heterologous polypeptide is not drawn from the
MDZ3, MDZ4, MDZ7 or MDZ12 protein itself.

The fusion proteins of the present invention will
include at least one fragment of the protein of the present
invention, which fragment is at least 6, typically at least
8, often at least 15, and usefully at least 16, 17, 18, 19,
or 20 amino acids long. The fragment of the protein of the
present invention to be included in the fusion can usefully
be at least 25 amino acids long, at least 50 amino acids
long, and can be at least 75, 100, or even 150 amino acids
long. Fusions that include the entirety of the proteins of
the present invention have particular utility.

The heterologous polypeptide included within the 
fusion protein of the present invention is at least 6 amino
acids in length, often at least 8 amino acids in length, and
usefully at least 15, 20, and 25 amino acids in length.
Fusions that include larger polypeptides, such as the IgG Fc
region, and even entire proteins (such as GFP
chromophore-containing proteins), have particular utility.

As described above in the description of vectors
and expression vectors of the present invention, which
discussion is incorporated herein by reference in its
entirety, heterologous polypeptides to be included in the
fusion proteins of the present invention can usefully
include those designed to facilitate purification and/or
visualization of recombinantly-expressed proteins. Although
purification tags can also be incorporated into fusions that
are chemically synthesized, chemical synthesis typically
provides sufficient purity that further purification by HPLC
suffices; however, visualization tags as above described
retain their utility even when the protein is produced by
chemical synthesis, and when so included render the fusion
proteins of the present invention useful as directly
detectable markers of MDZ3, MDZ4, MDZ7 or MDZ12 presence.

As also discussed above, heterologous polypeptides 
to be included in the fusion proteins of the present 
invention can usefully include those that facilitate 
secretion of recombinantly expressed proteins (into the
periplasmic space or extracellular milieu for prokaryotic
hosts, into the culture medium for eukaryotic cells)
through incorporation of secretion signals and/or leader
sequences.

Other useful protein fusions of the present 
invention include those that permit use of the protein of 
the present invention as bait in a yeast two-hybrid system.
See Bartel et al. (eds.), The Yeast Two-Hybrid System,
Oxford University Press (1997) (ISBN: 0195109384); Zhu et
al., Yeast Hybrid Technologies, Eaton Publishing (2000)
(ISBN 1-881299-15-5); Fields et al., Trends Genet.
10(8):286-92 (1994); Mendelsohn et al., Curr. Opin.
Biotechnol. 5(5):482-6 (1994); Luban et al., Curr. Opin.
Biotechnol. 6(1):59-64 (1995); Allen et al., Trends Biochem.
Sci. 20(12):511-6 (1995); Drees, Curr. Opin. Chem. Biol.
3(1):64-70 (1999); Topcu et al., Pharm. Res. 17(9):1049-55
(2000); Fashena et al., Gene 250(1-2):1-14 (2000), the
disclosures of which are incorporated herein by reference in
their entireties. Typically, such fusion is to either E.
coli LexA or yeast GAL4 DNA binding domains. Related bait 
plasmids are available that express the bait fused to a 
nuclear localization signal. 

Other useful protein fusions include those that 
permit display of the encoded protein on the surface of a 
phage or cell, fusions to intrinsically fluorescent 
proteins, such as green fluorescent protein (GFP), and
fusions to the IgG Fc region, as described above, which
discussion is incorporated here by reference in its
entirety.

The proteins and protein fragments of the present 
invention can also usefully be fused to protein toxins, such 
as Pseudomonas exotoxin A, diphtheria toxin, shiga toxin A,
anthrax toxin lethal factor, or ricin, in order to effect
ablation of cells that bind or take up the proteins of the 
present invention. 

The isolated proteins, protein fragments, and
protein fusions of the present invention can be composed of
natural amino acids linked by native peptide bonds, or can
contain any or all of nonnatural amino acid analogues,
nonnative bonds, and post-synthetic (post-translational)
modifications, either throughout the length of the protein
or localized to one or more portions thereof.

As is well known in the art, when the isolated
protein is used, e.g., for epitope mapping, the range of
such nonnatural analogues, nonnative inter-residue bonds, or
post-synthesis modifications will be limited to those that
permit binding of the peptide to antibodies. When used as
an immunogen for the preparation of antibodies in a
non-human host, such as a mouse, the range of such nonnatural
analogues, nonnative inter-residue bonds, or post-synthesis
modifications will be limited to those that do not interfere
with the immunogenicity of the protein. When the isolated
protein is used as a therapeutic agent, such as a vaccine or
for replacement therapy, the range of such changes will be
limited to those that do not confer toxicity upon the
isolated protein.

Non-natural amino acids can be incorporated during
solid phase chemical synthesis or by recombinant techniques,
although the former is typically more common. 

Solid phase chemical synthesis of peptides is well
established in the art. Procedures are described, inter
alia, in Chan et al. (eds.), Fmoc Solid Phase Peptide
Synthesis: A Practical Approach (Practical Approach Series),
Oxford Univ. Press (March 2000) (ISBN: 0199637245); Jones,
Amino Acid and Peptide Synthesis (Oxford Chemistry Primers,
No 7), Oxford Univ. Press (August 1992) (ISBN: 0198556683);
and Bodanszky, Principles of Peptide Synthesis (Springer
Laboratory), Springer Verlag (December 1993) (ISBN:
0387564314), the disclosures of which are incorporated
herein by reference in their entireties.

For example, D-enantiomers of natural amino acids
can readily be incorporated during chemical peptide
synthesis: peptides assembled from D-amino acids are more
resistant to proteolytic attack; incorporation of
D-enantiomers can also be used to confer specific three
dimensional conformations on the peptide. Other amino acid
analogues commonly added during chemical synthesis include
ornithine, norleucine, phosphorylated amino acids (typically
phosphoserine, phosphothreonine, phosphotyrosine),
L-malonyltyrosine, a non-hydrolyzable analog of
phosphotyrosine (Kole et al., Biochem. Biophys. Res. Com.
209:817-821 (1995)), and various halogenated phenylalanine
derivatives.

Amino acid analogues having detectable labels are
also usefully incorporated during synthesis to provide a
labeled polypeptide. 

Biotin, for example (indirectly detectable through
interaction with avidin, streptavidin, neutravidin,
captavidin, or anti-biotin antibody), can be added using
biotinoyl--(9-fluorenylmethoxycarbonyl)-L-lysine (FMOC
biocytin) (Molecular Probes, Eugene, OR, USA). (Biotin can
also be added enzymatically by incorporation into a fusion
protein of an E. coli BirA substrate peptide.)

The FMOC and tBOC derivatives of dabcyl-L-lysine
(Molecular Probes, Inc., Eugene, OR, USA) can be used to
incorporate the dabcyl chromophore at selected sites in the
peptide sequence during synthesis. The aminonaphthalene
derivative EDANS, the most common fluorophore for pairing
with the dabcyl quencher in fluorescence resonance energy
transfer (FRET) systems, can be introduced during automated
synthesis of peptides by using EDANS-FMOC-L-glutamic acid
or the corresponding tBOC derivative (both from Molecular
Probes, Inc., Eugene, OR, USA). Tetramethylrhodamine
fluorophores can be incorporated during automated FMOC
synthesis of peptides using (FMOC)--TMR-L-lysine (Molecular
Probes, Inc., Eugene, OR, USA).

Other useful amino acid analogues that can be 
incorporated during chemical synthesis include aspartic 
acid, glutamic acid, lysine, and tyrosine analogues having 
allyl side-chain protection (Applied Biosystems, Inc.,
Foster City, CA, USA); the allyl side chain permits
synthesis of cyclic, branched-chain, sulfonated, 
glycosylated, and phosphorylated peptides. 

A large number of other FMOC-protected non-natural
amino acid analogues capable of incorporation during
chemical synthesis are available commercially, including,
e.g., Fmoc-2-aminobicyclo[2.2.1]heptane-2-carboxylic acid,
Fmoc-3-endo-aminobicyclo[2.2.1]heptane-2-endo-carboxylic
acid, Fmoc-3-exo-aminobicyclo[2.2.1]heptane-2-exo-carboxylic
acid, Fmoc-3-endo-amino-bicyclo[2.2.1]hept-5-ene-2-endo-carboxylic
acid, Fmoc-3-exo-amino-bicyclo[2.2.1]hept-5-ene-2-exo-carboxylic
acid, Fmoc-cis-2-amino-1-cyclohexanecarboxylic acid,
Fmoc-trans-2-amino-1-cyclohexanecarboxylic acid,
Fmoc-1-amino-1-cyclopentanecarboxylic acid,
Fmoc-cis-2-amino-1-cyclopentanecarboxylic acid,
Fmoc-1-amino-1-cyclopropanecarboxylic acid,
Fmoc-D-2-amino-4-(ethylthio)butyric acid,
Fmoc-L-2-amino-4-(ethylthio)butyric acid, Fmoc-L-buthionine,
Fmoc-S-methyl-L-cysteine, Fmoc-2-aminobenzoic acid
(anthranilic acid), Fmoc-3-aminobenzoic acid,
Fmoc-4-aminobenzoic acid,
Fmoc-2-aminobenzophenone-2'-carboxylic acid,
Fmoc-N-(4-aminobenzoyl)-β-alanine,
Fmoc-2-amino-4,5-dimethoxybenzoic acid, Fmoc-4-aminohippuric
acid, Fmoc-2-amino-3-hydroxybenzoic acid,
Fmoc-2-amino-5-hydroxybenzoic acid,
Fmoc-3-amino-4-hydroxybenzoic acid,
Fmoc-4-amino-3-hydroxybenzoic acid,
Fmoc-4-amino-2-hydroxybenzoic acid,
Fmoc-5-amino-2-hydroxybenzoic acid,
Fmoc-2-amino-3-methoxybenzoic acid,
Fmoc-4-amino-3-methoxybenzoic acid,
Fmoc-2-amino-3-methylbenzoic acid,
Fmoc-2-amino-5-methylbenzoic acid,
Fmoc-2-amino-6-methylbenzoic acid,
Fmoc-3-amino-2-methylbenzoic acid,
Fmoc-3-amino-4-methylbenzoic acid,
Fmoc-4-amino-3-methylbenzoic acid, Fmoc-3-amino-2-naphthoic
acid, Fmoc-D,L-3-amino-3-phenylpropionic acid,
Fmoc-L-Methyldopa,
Fmoc-2-amino-4,6-dimethyl-3-pyridinecarboxylic acid,
Fmoc-D,L-?-amino-2-thiophenacetic acid,
Fmoc-4-(carboxymethyl)piperazine, Fmoc-4-carboxypiperazine,
Fmoc-4-(carboxymethyl)homopiperazine,
Fmoc-4-phenyl-4-piperidinecarboxylic acid,
Fmoc-L-1,2,3,4-tetrahydronorharman-3-carboxylic acid, and
Fmoc-L-thiazolidine-4-carboxylic acid, all available from
The Peptide Laboratory (Richmond, CA, USA).

Non-natural residues can also be added
biosynthetically by engineering a suppressor tRNA, typically
one that recognizes the UAG stop codon, by chemical
aminoacylation with the desired unnatural amino acid.
Conventional site-directed mutagenesis is used to introduce
the chosen stop codon UAG at the site of interest in the
protein gene. When the acylated suppressor tRNA and the
mutant gene are combined in an in vitro
transcription/translation system, the unnatural amino acid
is incorporated in response to the UAG codon to give a
protein containing that amino acid at the specified
position. Liu et al., Proc. Natl. Acad. Sci. USA
96(9):4780-5 (1999); Wang et al., Science 292(5516):498-500
(2001).
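
By way of illustration only, the placement of the UAG codon at a site of interest can be sketched computationally. The following Python sketch assumes an in-frame coding sequence supplied as a DNA string and a 1-based codon position; the function name introduce_amber and the short example sequence are hypothetical and are not taken from the Sequence Listing. It models only the sequence change, not the mutagenesis chemistry or the suppressor tRNA.

```python
AMBER = "TAG"  # DNA form of the UAG (amber) stop codon


def introduce_amber(cds: str, codon_number: int) -> str:
    """Return a copy of `cds` with codon `codon_number` (1-based) replaced by TAG.

    `cds` is assumed to be an in-frame coding sequence (length divisible by 3).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= codon_number <= n_codons:
        raise ValueError("codon_number lies outside the coding sequence")
    start = 3 * (codon_number - 1)
    return cds[:start] + AMBER + cds[start + 3:]


# Hypothetical five-codon reading frame: ATG AAA TTT GGC TAA
example_cds = "ATGAAATTTGGCTAA"
mutant_cds = introduce_amber(example_cds, 3)  # "ATGAAATAGGGCTAA"
```

In this sketch the third codon (TTT) is the assumed site of interest; in the mutant, that position would be read by the acylated suppressor tRNA rather than terminating translation.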

The isolated proteins, protein fragments and 
fusion proteins of the present invention can also include 
nonnative inter-residue bonds, including bonds that lead to 
circular and branched forms. 

The isolated proteins and protein fragments of the 
present invention can also include post-translational and 
post-synthetic modifications, either throughout the length 
of the protein or localized to one or more portions thereof. 

For example, when produced by recombinant 
expression in eukaryotic cells, the isolated proteins, 
fragments, and fusion proteins of the present invention will 
typically include N-linked and/or O-linked glycosylation,
the pattern of which will reflect both the availability of 
glycosylation sites on the protein sequence and the identity 
of the host cell. Further modification of glycosylation 
pattern can be performed enzymatically.

As another example, recombinant polypeptides of 
the invention may also include an initial modified 
methionine residue, in some cases resulting from host- 
mediated processes. 

When the proteins, protein fragments, and protein 
fusions of the present invention are produced by chemical 
synthesis, post-synthetic modification can be performed 
before deprotection and cleavage from the resin or after 
deprotection and cleavage. Modification before deprotection 
and cleavage of the synthesized protein often allows greater
control, e.g. by allowing targeting of the modifying moiety 
to the N-terminus of a resin-bound synthetic peptide. 

Useful post-synthetic (and post-translational) 
modifications include conjugation to detectable labels, such 
as fluorophores.

A wide variety of amine-reactive and thiol- 
reactive fluorophore derivatives have been synthesized that 
react under nondenaturing conditions with N-terminal amino 
groups and epsilon amino groups of lysine residues, on the 
one hand, and with free thiol groups of cysteine residues, 
on the other. 

Kits are available commercially that permit 
conjugation of proteins to a variety of amine-reactive or 
thiol-reactive fluorophores: Molecular Probes, Inc. (Eugene,
OR, USA), e.g., offers kits for conjugating proteins to
Alexa Fluor 350, Alexa Fluor 430, Fluorescein-EX, Alexa
Fluor 488, Oregon Green 488, Alexa Fluor 532, Alexa Fluor
546, Alexa Fluor 568, Alexa Fluor 594, and
Texas Red-X.

A wide variety of other amine-reactive and thiol- 
reactive fluorophores are available commercially (Molecular 
Probes, Inc., Eugene, OR, USA), including Alexa Fluor® 350, 
Alexa Fluor® 488, Alexa Fluor® 532, Alexa Fluor® 546, Alexa 
Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647 (monoclonal 
antibody labeling kits available from Molecular Probes, 
Inc., Eugene, OR, USA), BODIPY dyes, such as BODIPY 493/503, 
BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR, BODIPY 
558/568, BODIPY 564/570, BODIPY 576/589,
BODIPY 581/591, BODIPY TR, BODIPY 630/650, BODIPY 650/665,
Cascade Blue, Cascade Yellow, Dansyl, lissamine rhodamine B,
Marina Blue, Oregon Green 488, Oregon Green 514, Pacific
Blue, rhodamine 6G, rhodamine green, rhodamine red,
tetramethylrhodamine, and Texas Red (available from Molecular
Probes, Inc., Eugene, OR, USA).

The polypeptides of the present invention can also
be conjugated to fluorophores, other proteins, and other
macromolecules, using bifunctional linking reagents.

Common homobifunctional reagents include, e.g.,
APG, AEDP, BASED, BMB, BMDB, BMH, BMOE, BM[PEO]3, BM[PEO]4,
BS3, BSOCOES, DFDNB, DMA, DMP, DMS, DPDPB, DSG, DSP
(Lomant's Reagent), DSS, DST, DTBP, DTME, DTSSP, EGS, HBVS,
Sulfo-BSOCOES, Sulfo-DST, Sulfo-EGS (all available from
Pierce, Rockford, IL, USA); common heterobifunctional
cross-linkers include ABH, AMAS, ANB-NOS, APDP, ASBA, BMPA,
BMPH, BMPS, EDC, EMCA, EMCH, EMCS, KMUA, KMUH, GMBS,
LC-SMCC, LC-SPDP, MBS, M2C2H, MPBH, MSA, NHS-ASA, PDPH,
PMPI, SADP, SAED, SAND, SANPAH, SASD, SATP, SBAP, SFAD, SIA,
SIAB, SMCC, SMPB, SMPH, SMPT, SPDP, Sulfo-EMCS, Sulfo-GMBS,
Sulfo-HSAB, Sulfo-KMUS, Sulfo-LC-SPDP, Sulfo-MBS,
Sulfo-NHS-LC-ASA, Sulfo-SADP, Sulfo-SANPAH, Sulfo-SIAB,
Sulfo-SMCC, Sulfo-SMPB, Sulfo-LC-SMPT, SVSB, TFCS (all
available from Pierce, Rockford, IL, USA).

The proteins, protein fragments, and protein 
fusions of the present invention can be conjugated, using 
such cross-linking reagents, to fluorophores that are not 
amine- or thiol-reactive.

Other labels that usefully can be conjugated to 
the proteins, protein fragments, and fusion proteins of the 
present invention include radioactive labels, 
echosonographic contrast reagents, and MRI contrast agents.

The proteins, protein fragments, and protein 
fusions of the present invention can also usefully be 
conjugated using cross-linking agents to carrier proteins, 
such as KLH, bovine thyroglobulin, and even bovine serum 
albumin (BSA), to increase immunogenicity for raising anti-MDZ3,
anti-MDZ4, anti-MDZ7 or anti-MDZ12 antibodies.

The proteins, protein fragments, and protein 
fusions of the present invention can also usefully be 
conjugated to polyethylene glycol (PEG); PEGylation
increases the serum half life of proteins administered
intravenously for replacement therapy. Delgado et al.,
Crit. Rev. Ther. Drug Carrier Syst. 9(3-4):249-304 (1992);
Scott et al., Curr. Pharm. Des. 4(6):423-38 (1998);
DeSantis et al., Curr. Opin. Biotechnol. 10(4):324-30
(1999), incorporated herein by reference in their
entireties. PEG monomers can be attached to the protein
directly or through a linker, with PEGylation using PEG
monomers activated with tresyl chloride
(2,2,2-trifluoroethanesulphonyl chloride) permitting direct
attachment under mild conditions.

The isolated proteins of the present invention, 
including fusions thereof, can be produced by recombinant 
expression, typically using the expression vectors of the 
present invention as above-described or, if fewer than about
100 amino acids, by chemical synthesis (typically, solid 
phase synthesis), and, on occasion, by in vitro translation. 

Production of the isolated proteins of the present
invention can optionally be followed by purification.

Purification of recombinantly expressed proteins
is now well within the skill in the art. See, e.g., Thorner
et al. (eds.), Applications of Chimeric Genes and Hybrid
Proteins, Part A: Gene Expression and Protein Purification
(Methods in Enzymology, Volume 326), Academic Press (2000),
(ISBN: 0121822273); Hardin (ed.), Cloning, Gene Expression
and Protein Purification: Experimental Procedures and
Process Rationale, Oxford Univ. Press (2001) (ISBN:
0195132947); Marshak et al., Strategies for Protein
Purification and Characterization: A Laboratory Course
Manual, Cold Spring Harbor Laboratory Press (1996) (ISBN:
0-87969-385-1); and Roe (ed.), Protein Purification
Applications, Oxford University Press (2001), the
disclosures of which are incorporated herein by reference in
their entireties, and thus need not be detailed here.

Briefly, however, if purification tags have been 
fused through use of an expression vector that appends such 
tag, purification can be effected, at least in part, by 
means appropriate to the tag, such as use of immobilized 
metal affinity chromatography for polyhistidine tags. Other 
techniques common in the art include ammonium sulfate 
fractionation, immunoprecipitation, fast protein liquid 
chromatography (FPLC), high performance liquid
chromatography (HPLC), and preparative gel electrophoresis.

Purification of chemically-synthesized peptides 
can readily be effected, e.g., by HPLC. 

Accordingly, it is an aspect of the present 
invention to provide the isolated proteins of the present 
invention in pure or substantially pure form. 

A purified protein of the present invention is an 
isolated protein, as above described, that is present at a 
concentration of at least 95%, as measured on a weight basis 
(w/w) with respect to total protein in a composition. Such 
purities can often be obtained during chemical synthesis 
without further purification, as, e.g., by HPLC. Purified 
proteins of the present invention can be present at a 
concentration (measured on a weight basis with respect to 
total protein in a composition) of 96%, 97%, 98%, and even 
99%. The proteins of the present invention can even be 
present at levels of 99.5%, 99.6%, and even 99.7%, 99.8%, or 
even 99.9% following purification, as by HPLC. 
Although high levels of purity are particularly
useful when the isolated proteins of the present invention 
are used as therapeutic agents - such as vaccines, or for 
replacement therapy - the isolated proteins of the present 
invention are also useful at lower purity. For example, 
partially purified proteins of the present invention can be 
used as immunogens to raise antibodies in laboratory 
animals.

Thus, in another aspect, the present invention 
provides the isolated proteins of the present invention in 
substantially purified form. A "substantially purified 
protein" of the present invention is an isolated protein, as 
above described, present at a concentration of at least 70%, 
measured on a weight basis with respect to total protein in 
a composition. Usefully, the substantially purified protein 
is present at a concentration, measured on a weight basis 
with respect to total protein in a composition, of at least 
75%, 80%, or even at least 85%, 90%, 91%, 92%, 93%, 94%, 
94.5% or even at least 94.9%. 
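
The weight-basis thresholds recited above (at least 95% for a purified protein, at least 70% for a substantially purified protein) can be illustrated with a short calculation. The following Python sketch is offered only as an example; the function names and the masses passed to them are hypothetical, and it assumes that the mass of the protein of interest and the mass of total protein in the composition have been measured in the same units.

```python
def percent_purity(target_mass: float, total_protein_mass: float) -> float:
    """Weight-basis (w/w) purity: target protein as a percentage of total protein."""
    if total_protein_mass <= 0:
        raise ValueError("total protein mass must be positive")
    if not 0 <= target_mass <= total_protein_mass:
        raise ValueError("target mass must lie between 0 and total protein mass")
    return 100.0 * target_mass / total_protein_mass


def classify_purity(pct: float) -> str:
    """Apply the 95% and 70% (w/w) thresholds recited in the text."""
    if pct >= 95.0:
        return "purified"
    if pct >= 70.0:
        return "substantially purified"
    return "neither"


# Hypothetical preparation: 19 mg of the protein of interest in 20 mg total protein.
example_pct = percent_purity(19.0, 20.0)
example_class = classify_purity(example_pct)
```

A preparation falling below the 70% threshold is labeled "neither" here purely for convenience; as noted above, partially purified proteins remain useful, for example as immunogens.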

In preferred embodiments, the purified and 
substantially purified proteins of the present invention are 
in compositions that lack detectable ampholytes, acrylamide 
monomers, bis-acrylamide monomers, and polyacrylamide.

The proteins, fragments, and fusions of the 
present invention can usefully be attached to a substrate. 
The substrate can be porous or solid, planar or non-planar; the
bond can be covalent or noncovalent. 

For example, the proteins, fragments, and fusions 
of the present invention can usefully be bound to a porous 
substrate, commonly a membrane, typically comprising 
nitrocellulose, polyvinylidene fluoride (PVDF), or
cationically derivatized, hydrophilic PVDF; so bound, the
proteins, fragments, and fusions of the present invention
can be used to detect and quantify antibodies, e.g. in
serum, that bind specifically to the immobilized protein of 
the present invention. 

As another example, the proteins, fragments, and 
fusions of the present invention can usefully be bound to a 
substantially nonporous substrate, such as plastic, to 
detect and quantify antibodies, e.g. in serum, that bind 
specifically to the immobilized protein of the present 
invention. Such plastics include polymethylacrylic, 
polyethylene, polypropylene, polyacrylate, 
polymethylmethacrylate, polyvinylchloride, 
polytetrafluoroethylene, polystyrene, polycarbonate, 
polyacetal, polysulfone, cellulose acetate, cellulose nitrate,
nitrocellulose, or mixtures thereof; when the assay is
performed in a standard microtiter dish, the plastic is
typically polystyrene. 

The proteins, fragments, and fusions of the 
present invention can also be attached to a substrate 
suitable for use as a surface enhanced laser desorption 
ionization source; so attached, the protein, fragment, or 
fusion of the present invention is useful for binding and 
then detecting secondary proteins that bind with sufficient 
affinity or avidity to the surface-bound protein to indicate 
biologic interaction therebetween. The proteins, fragments, 
and fusions of the present invention can also be attached to 
a substrate suitable for use in surface plasmon resonance 
detection; so attached, the protein, fragment, or fusion of 
the present invention is useful for binding and then 
detecting secondary proteins that bind with sufficient 
affinity or avidity to the surface-bound protein to indicate 
biological interaction therebetween. 
MDZ3 Proteins

In a first series of protein embodiments, the 
invention provides an isolated MDZ3 polypeptide having an 
amino acid sequence encoded by the cDNA in ATCC Deposit No.
______, or the amino acid sequence in SEQ ID NO:
3, which are full length MDZ3 proteins. When used as
immunogens, the full length proteins of the present
invention can be used, inter alia, to elicit antibodies that
bind to a variety of epitopes of the MDZ3 protein.

The invention further provides fragments of the 
above-described polypeptides, particularly fragments having 
at least 6 amino acids, typically at least 8 amino acids, 
often at least 15 amino acids, and even the entirety of the 
sequence given in SEQ ID NO: 3.

The invention further provides fragments of at 
least 6 amino acids, typically at least 8 amino acids, often 
at least 15 amino acids, and even the entirety of the 
sequence given in SEQ ID NO: 7.

As described above, the invention further provides
proteins that differ in sequence from those described with
particularity in the above-referenced SEQ ID NOs, whether by
way of insertion or deletion, by way of conservative or
moderately conservative substitutions, as hybridization
related proteins, or as cross-hybridizing proteins.

Particularly useful among the above-described 
proteins are those having at least one C2H2 (Kruppel family) 
zinc finger, and especially those that have a plurality of 
C2H2 zinc fingers in tandem, particularly those that have 7 
tandem C2H2 zinc fingers. Also particularly useful among
the above-described fragments are those having a SCAN
domain, those having a KRAB domain, and those that
include all of a SCAN domain, KRAB domain, and 7 zinc
fingers.

Also particularly useful are those proteins that 
have sequence-specific nucleic acid binding regulatory 
activity, and that participate in protein-protein
interactions with other transcription modulators. 

The invention further provides fusions of the 
proteins and protein fragments herein described to 
heterologous polypeptides.

MDZ4 Proteins 

In a first series of protein embodiments, the 
invention provides an isolated MDZ4 polypeptide having an
amino acid sequence encoded by the cDNA in ATCC Deposit No.
______, or the amino acid sequence in SEQ ID
NO: 3029, which are full length MDZ4 proteins. When used as
immunogens, the full length proteins of the present
invention can be used, inter alia, to elicit antibodies that
bind to a variety of epitopes of the MDZ4 protein.

The invention further provides fragments of the 
above-described polypeptides, particularly fragments having 
at least 6 amino acids, typically at least 8 amino acids, 
often at least 15 amino acids, and even the entirety of the
sequence given in SEQ ID NO: 3029.

The invention further provides fragments of at 
least 6 amino acids, typically at least 8 amino acids, often 
at least 15 amino acids, and even the entirety of the 
sequence given in SEQ ID NOs: 3033 and 3037.

As described above, the invention further provides
proteins that differ in sequence from those described with
particularity in the above-referenced SEQ ID NOs, whether
by way of insertion or deletion, by way of conservative or
moderately conservative substitutions, as hybridization
related proteins, or as cross-hybridizing proteins.

Particularly useful among such proteins are those 
that have at least one C2H2 (Kruppel family) zinc finger, 
and especially those that have 5 C2H2 zinc fingers in 
tandem, those that have a SCAN domain, and those that 
include all of a SCAN domain and 5 zinc fingers. 

Also particularly useful among the above-described 
MDZ4 proteins are those that have sequence-specific nucleic 
acid binding regulatory activity, and that participate in 
protein-protein interactions with other transcription 
modulators.

The invention further provides fusions of the 
proteins and protein fragments herein described to 
heterologous polypeptides. 

MDZ7 Proteins 

In a first series of protein embodiments, the 
invention provides an isolated MDZ7 polypeptide having an 
amino acid sequence encoded by the cDNA in ATCC Deposit No.
______, or the amino acid sequence in SEQ ID
NO: 4409, which are full length MDZ7 proteins. When used as
immunogens, the full length proteins of the present 
invention can be used, inter alia, to elicit antibodies that 
bind to a variety of epitopes of the MDZ7 protein. 

The invention further provides fragments of the 
above-described polypeptides, particularly fragments having 
at least 6 amino acids, typically at least 8 amino acids, 
often at least 15 amino acids, and even the entirety of the 
sequence given in SEQ ID NO: 4409. 

As described above, the invention further provides 
proteins that differ in sequence from those described with 
particularity in the above-referenced SEQ ID NOs, whether by
way of insertion or deletion, by way of conservative or
moderately conservative substitutions, as hybridization
related proteins, or as cross-hybridizing proteins, with
those that substantially retain a MDZ7 activity particularly
useful.

Particularly useful among the above-described MDZ7 
proteins are those that have at least one C2H2 (Kruppel 
family) zinc finger, especially those having a plurality of 
zinc fingers in tandem, particularly those having 7 zinc 
fingers in tandem. 

Also particularly useful among the above-described 
MDZ7 proteins are those that have sequence-specific nucleic 
acid binding regulatory activity, and that function in 
sequence-specific modulation of gene expression. 

The invention further provides fusions of the 
proteins and protein fragments herein described to 
heterologous polypeptides.

MDZ12 Proteins 

In a first series of protein embodiments, the 
invention provides an isolated MDZ12a polypeptide having an 
amino acid sequence encoded by the cDNA in ATCC Deposit No.
______, or the amino acid sequence in SEQ ID NO:
5772, which are full length MDZ12a proteins. When used as
immunogens, the full length proteins of the present 
invention can be used, inter alia, to elicit antibodies that 
bind to a variety of epitopes of the MDZ12a protein. 

The invention further provides fragments of the 
above-described polypeptides, particularly fragments having 
at least 6 amino acids, typically at least 8 amino acids, 
often at least 15 amino acids, and even the entirety of the 
sequence given in SEQ ID NO: 5772.

The invention further provides fragments of at 
least 6 amino acids, typically at least 8 amino acids, often 
at least 15 amino acids, and even the entirety of the 
sequence given in SEQ ID NO: 5774.

In another series of protein embodiments, the 
invention provides an isolated MDZ12bS polypeptide having an 
amino acid sequence encoded by the MDZ12bS part of the
MDZ12b cDNA in ATCC Deposit No. ______, or the amino
acid sequence in SEQ ID NO: 6939, a full length MDZ12bS
protein. When used as immunogens, the full length proteins
of the present invention can be used, inter alia, to elicit
antibodies that bind to a variety of epitopes of the MDZ12bS
protein.

The invention further provides fragments of the 
above-described polypeptides, particularly fragments having 
at least 6 amino acids, typically at least 8 amino acids, 
often at least 15 amino acids, and even the entirety of the 
sequence given in SEQ ID NO: 6939. 

In another series of protein embodiments, the 
invention provides an isolated MDZ12bL polypeptide having an 
amino acid sequence encoded by the MDZ12bL portion of the
MDZ12b cDNA in ATCC Deposit No. ______, or the
amino acid sequence in SEQ ID NO: 6940, which is a full
length MDZ12bL protein. When used as immunogens, the full 
length proteins of the present invention can be used, inter 
alia, to elicit antibodies that bind to a variety of 
epitopes of the MDZ12bL protein. 

The invention further provides fragments of the 
above-described polypeptides, particularly fragments having 
at least 6 amino acids, typically at least 8 amino acids, 
often at least 15 amino acids, and even the entirety of the 
sequence given in SEQ ID NO: 6940.

The invention further provides fragments of at 
least 6 amino acids, typically at least 8 amino acids, often 
at least 15 amino acids, and even the entirety of the 
sequence given in SEQ ID NO: 6942. 

As described above, the invention further provides 
proteins that differ in sequence from those described with 
particularity in the above-referenced SEQ ID NOs, whether
by way of insertion or deletion, by way of conservative or 
moderately conservative substitutions, as hybridization 
related proteins, or as cross-hybridizing proteins, with 
those that substantially retain a MDZ12 activity 
particularly useful. 

Particularly useful among the above-described 
MDZ12 proteins are those that have a C2H2 (Kruppel family) 
zinc finger, particularly those having a plurality of such 
zinc fingers in tandem, especially those having at least 5, 
often at least 12, zinc fingers in tandem. Also 
particularly useful among the above-described proteins are 
those that have a KRAB-B domain, especially those having 
both a KRAB domain and at least one, preferably a plurality, 
especially at least 10, often at least 12, zinc finger 
domains.

Particularly useful proteins are those that act as 
sequence-specific transcription regulators, and that 
interact with other transcriptional modulators by
protein-protein interactions. 

The invention further provides fusions of the 
proteins and protein fragments herein described to 
heterologous polypeptides. 
ANTIBODIES AND ANTIBODY-PRODUCING CELLS

In another aspect, the invention provides 
antibodies, including fragments and derivatives thereof, 
that bind specifically to MDZ3, MDZ4, MDZ7 or MDZ12 proteins 
and protein fragments of the present invention or to one or 
more of the proteins and protein fragments encoded by the 
isolated MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids of the 
present invention. The antibodies of the present invention 
can be specific for all of linear epitopes, discontinuous 
epitopes, or conformational epitopes of such proteins or 
protein fragments, either as present on the protein in its 
native conformation or, in some cases, as present on the 
proteins as denatured, as, e.g., by solubilization in SDS. 

In other embodiments, the invention provides 
antibodies, including fragments and derivatives thereof, the 
binding of which can be competitively inhibited by one or 
more of the MDZ3, MDZ4, MDZ7 or MDZ12 proteins and protein 
fragments of the present invention, or by one or more of the 
proteins and protein fragments encoded by the isolated MDZ3, 
MDZ4, MDZ7 or MDZ12 nucleic acids of the present invention. 

As used herein, the term "antibody" refers to a 
polypeptide, at least a portion of which is encoded by at 
least one immunoglobulin gene, which can bind specifically 
to a first molecular species, and to fragments or 
derivatives thereof that remain capable of such specific 
binding. 

By "bind specifically" and "specific binding" is 
here intended the ability of the antibody to bind to a first 
molecular species in preference to binding to other 
molecular species with which the antibody and first 
molecular species are admixed. An antibody is said
specifically to "recognize" a first molecular species when
it can bind specifically to that first molecular species. 

As is well known in the art, the degree to which 
an antibody can discriminate as among molecular species in a 
mixture will depend, in part, upon the conformational 
relatedness of the species in the mixture; typically, the 
antibodies of the present invention will discriminate over 
adventitious binding to non-MDZ3, non-MDZ4, non-MDZ7 or non- 
MDZ12 proteins by at least two-fold, more typically by at 
least 5-fold, typically by more than 10-fold, 25-fold, 50- 
fold, 75-fold, and often by more than 100-fold, and on 
occasion by more than 500-fold or 1000-fold. When used to 
detect the proteins or protein fragments of the present 
invention, the antibody of the present invention is 
sufficiently specific when it can be used to determine the 
presence of the protein of the present invention in samples 
derived from human tissues expressing each of the genes 
(see, e.g., Examples 1, 2, 3, and 4). 
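
The fold-discrimination figures recited above can be read as a simple ratio of specific to adventitious binding. The following Python sketch is illustrative only: it assumes that binding to the target protein and to a non-target protein has been measured in the same assay and expressed in the same arbitrary units (for example, background-subtracted signal), and the function names and numerical readings are hypothetical.

```python
def fold_discrimination(target_signal: float, off_target_signal: float) -> float:
    """Ratio of binding signal for the target protein to that for a non-target protein."""
    if target_signal < 0:
        raise ValueError("target signal must not be negative")
    if off_target_signal <= 0:
        raise ValueError("off-target signal must be positive")
    return target_signal / off_target_signal


def meets_fold_threshold(fold: float, minimum_fold: float = 2.0) -> bool:
    """True if discrimination is at least `minimum_fold` (two-fold by default)."""
    return fold >= minimum_fold


# Hypothetical readings in arbitrary units.
example_fold = fold_discrimination(target_signal=1.5, off_target_signal=0.125)
passes_two_fold = meets_fold_threshold(example_fold)
passes_hundred_fold = meets_fold_threshold(example_fold, minimum_fold=100.0)
```

With these assumed readings the ratio is 12, which satisfies the two-fold and 10-fold levels recited above but not the 100-fold level.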

Typically, the affinity or avidity of an antibody 
(or antibody multimer, as in the case of an IgM pentamer) of 
the present invention for a protein or protein fragment of 
the present invention will be at least about 1 x 10^-6 molar
(M), typically at least about 5 x 10^-7 M, usefully at least
about 1 x 10^-7 M, with affinities and avidities of at least
1 x 10^-8 M, 5 x 10^-9 M, and 1 x 10^-10 M proving especially
useful.

The antibodies of the present invention can be 
naturally-occurring forms, such as IgG, IgM, IgD, IgE, and 
IgA, from any mammalian species. 

Human antibodies can, but will infrequently, be 
drawn directly from human donors or human cells. In such 
case, antibodies to the proteins of the present invention 
will typically have resulted from fortuitous immunization,
such as autoimmune immunization, with the protein or protein
fragments of the present invention. Such antibodies will 
typically, but will not invariably, be polyclonal. 

Human antibodies are more frequently obtained 
using transgenic animals that express human immunoglobulin 
genes, which transgenic animals can be affirmatively 
immunized with the protein immunogen of the present 
invention. Human Ig-transgenic mice capable of producing 
human antibodies and methods of producing human antibodies 
therefrom upon specific immunization are described, inter 
alia, in U.S. Patent Nos. 6,162,963; 6,150,584; 6,114,598; 
6,075,181; 5,939,598; 5,877,397; 5,874,299; 5,814,318; 
5,789,650; 5,770,429; 5,661,016; 5,633,425; 5,625,126;
5,569,825; 5,545,807; 5,545,806, and 5,591,669, the
disclosures of which are incorporated herein by reference in 
their entireties. Such antibodies are typically monoclonal, 
and are typically produced using techniques developed for 
production of murine antibodies. 

Human antibodies are particularly useful, and 
often preferred, when the antibodies of the present 
invention are to be administered to human beings as in vivo 
diagnostic or therapeutic agents, since recipient immune 
response to the administered antibody will often be 
substantially less than that occasioned by administration of 
an antibody derived from another species, such as mouse. 

IgG, IgM, IgD, IgE and IgA antibodies of the 
present invention are also usefully obtained from other 
mammalian species, including rodents - typically mouse, but 
also rat, guinea pig, and hamster - lagomorphs, typically 
rabbits, and also larger mammals, such as sheep, goats, 
cows, and horses. In such cases, as with the transgenic 
human-antibody-producing non-human mammals, fortuitous 
immunization is not required, and the non-human mammal is 
typically affirmatively immunized, according to standard
immunization protocols, with the protein or protein fragment 
of the present invention. 

As discussed above, virtually all fragments of 8
or more contiguous amino acids of the proteins of the
present invention can be used effectively as immunogens when
conjugated to a carrier, typically a protein such as bovine
thyroglobulin, keyhole limpet hemocyanin, or bovine serum
albumin, conveniently using a bifunctional linker such as
those described elsewhere above, which discussion is
incorporated by reference here.
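
The set of fragments of 8 or more contiguous amino acids can be enumerated mechanically from any given protein sequence. The following Python sketch is provided only as an illustration; the function names are hypothetical, and the 10-residue example peptide is an arbitrary string of one-letter amino acid codes, not a sequence from the Sequence Listing.

```python
from typing import Iterator


def contiguous_fragments(sequence: str, min_length: int = 8) -> Iterator[str]:
    """Yield every contiguous fragment of `sequence` with at least `min_length` residues."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    n = len(sequence)
    for length in range(min_length, n + 1):
        for start in range(0, n - length + 1):
            yield sequence[start:start + length]


def count_fragments(sequence_length: int, min_length: int = 8) -> int:
    """Closed-form count: sum over L = min_length..n of (n - L + 1)."""
    if sequence_length < min_length:
        return 0
    k = sequence_length - min_length + 1
    return k * (k + 1) // 2


# Arbitrary 10-residue example peptide (hypothetical).
example_peptide = "ACDEFGHIKL"
example_fragments = list(contiguous_fragments(example_peptide, min_length=8))
```

For the 10-residue example there are three 8-mers, two 9-mers, and the full-length 10-mer, six fragments in all; the count grows quadratically with sequence length.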

Immunogenicity can also be conferred by fusion of 
the proteins and protein fragments of the present invention 
to other moieties. 

For example, peptides of the present invention can
be produced by solid phase synthesis on a branched
polylysine core matrix; these multiple antigenic peptides
(MAPs) provide high purity, increased avidity, accurate
chemical definition and improved safety in vaccine
development. Tam et al., Proc. Natl. Acad. Sci. USA
85:5409-5413 (1988); Posnett et al., J. Biol. Chem.
263:1719-1725 (1988).

Protocols for immunizing non-human mammals are
well-established in the art, Harlow et al. (eds.),
Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory (1998) (ISBN: 0879693142); Coligan et al. (eds.),
Current Protocols in Immunology, John Wiley & Sons, Inc.
(2001) (ISBN: 0-471-52276-7); Zola, Monoclonal Antibodies:
Preparation and Use of Monoclonal Antibodies and Engineered
Antibody Derivatives (Basics: From Background to Bench),
Springer Verlag (2000) (ISBN: 0387915907), the disclosures
of which are incorporated herein by reference, and often
include multiple immunizations, either with or without
adjuvants such as Freund's complete adjuvant and Freund's
incomplete adjuvant.

Antibodies from nonhuman mammals can be polyclonal 
or monoclonal, with polyclonal antibodies having certain 
advantages in immunohistochemical detection of the proteins
of the present invention and monoclonal antibodies having 
advantages in identifying and distinguishing particular 
epitopes of the proteins of the present invention. 

Following immunization, the antibodies of the
present invention can be produced using any art-accepted
technique. Such techniques are well known in the art,
Coligan et al. (eds.), Current Protocols in Immunology, John
Wiley & Sons, Inc. (2001) (ISBN: 0-471-52276-7); Zola,
Monoclonal Antibodies: Preparation and Use of Monoclonal
Antibodies and Engineered Antibody Derivatives (Basics: From
Background to Bench), Springer Verlag (2000) (ISBN:
0387915907); Howard et al. (eds.), Basic Methods in Antibody
Production and Characterization, CRC Press (2000) (ISBN:
0849394457); Harlow et al. (eds.), Antibodies: A Laboratory
Manual, Cold Spring Harbor Laboratory (1998) (ISBN:
0879693142); Davis (ed.), Monoclonal Antibody Protocols,
Vol. 45, Humana Press (1995) (ISBN: 0896033082); Delves
(ed.), Antibody Production: Essential Techniques, John Wiley
& Son Ltd (1997) (ISBN: 0471970107); Kenney, Antibody
Solution: An Antibody Methods Manual, Chapman & Hall (1997)
(ISBN: 0412141914), incorporated herein by reference in
their entireties, and thus need not be detailed here.

Briefly, however, such techniques include, inter
alia, production of monoclonal antibodies by hybridomas and
expression of antibodies or fragments or derivatives thereof
from host cells engineered to express immunoglobulin genes
or fragments thereof. These two methods of production are
not mutually exclusive: genes encoding antibodies specific
for the proteins or protein fragments of the present
invention can be cloned from hybridomas and thereafter
expressed in other host cells. Nor need the two necessarily
be performed together: e.g., genes encoding antibodies
specific for the proteins and protein fragments of the
present invention can be cloned directly from B cells known
to be specific for the desired protein, as further described
in U.S. Pat. No. 5,627,052, the disclosure of which is
incorporated herein by reference in its entirety, or from
antibody-displaying phage.

Recombinant expression in host cells is 
particularly useful when fragments or derivatives of the 
antibodies of the present invention are desired. 

Host cells for recombinant antibody production — 
either whole antibodies, antibody fragments, or antibody
derivatives — can be prokaryotic or eukaryotic. 

Prokaryotic hosts are particularly useful for 
producing phage displayed antibodies of the present 
invention. 

The technology of phage-displayed antibodies, in
which antibody variable region fragments are fused, for
example, to the gene III protein (pIII) or gene VIII protein
(pVIII) for display on the surface of filamentous phage,
such as M13, is by now well-established, Sidhu, Curr. Opin.
Biotechnol. 11(6):610-6 (2000); Griffiths et al., Curr.
Opin. Biotechnol. 9(1):102-8 (1998); Hoogenboom et al.,
Immunotechnology 4(1):1-20 (1998); Rader et al., Current
Opinion in Biotechnology 8:503-508 (1997); Aujame et al.,
Human Antibodies 8:155-168 (1997); Hoogenboom, Trends in
Biotechnol. 15:62-70 (1997); de Kruif et al., 17:453-455
(1996); Barbas et al., Trends in Biotechnol. 14:230-234
(1996); Winter et al., Ann. Rev. Immunol. 433-455 (1994),
and techniques and protocols required to generate,
propagate, screen (pan), and use the antibody fragments from
such libraries have recently been compiled, Barbas et al.,
Phage Display: A Laboratory Manual, Cold Spring Harbor
Laboratory Press (2001) (ISBN 0-87969-546-3); Kay et al.
(eds.), Phage Display of Peptides and Proteins: A Laboratory
Manual, Academic Press, Inc. (1996); Abelson et al. (eds.),
Combinatorial Chemistry, Methods in Enzymology vol. 267,
Academic Press (May 1996), the disclosures of which are
incorporated herein by reference in their entireties.

Typically, phage-displayed antibody fragments are
scFv fragments or Fab fragments; when desired, full length
antibodies can be produced by cloning the variable regions
from the displaying phage into a complete antibody and
expressing the full length antibody in a further prokaryotic
or a eukaryotic host cell.

Eukaryotic cells are also useful for expression of 
the antibodies, antibody fragments, and antibody derivatives 
of the present invention. 

For example, antibody fragments of the present
invention can be produced in Pichia pastoris, Takahashi et
al., Biosci. Biotechnol. Biochem. 64(10):2138-44 (2000);
Freyre et al., J. Biotechnol. 76(2-3):157-63 (2000); Fischer
et al., Biotechnol. Appl. Biochem. 30 (Pt 2):117-20 (1999);
Pennell et al., Res. Immunol. 149(6):599-603 (1998); Eldin
et al., J. Immunol. Methods 201(1):67-75 (1997); and in
Saccharomyces cerevisiae, Frenken et al., Res. Immunol.
149(6):589-99 (1998); Shusta et al., Nature Biotechnol.
16(8):773-7 (1998), the disclosures of which are
incorporated herein by reference in their entireties.

Antibodies, including antibody fragments and
derivatives, of the present invention can also be produced
in insect cells, Li et al., Protein Expr. Purif. 21(1):121-8
(2001); Ailor et al., Biotechnol. Bioeng. 58(2-3):196-203
(1998); Hsu et al., Biotechnol. Prog. 13(1):96-104 (1997);
Edelman et al., Immunology 91(1):13-9 (1997); and Nesbit et
al., J. Immunol. Methods 151(1-2):201-8 (1992), the
disclosures of which are incorporated herein by reference in
their entireties.

Antibodies and fragments and derivatives thereof
of the present invention can also be produced in plant
cells, Giddings et al., Nature Biotechnol. 18(11):1151-5
(2000); Gavilondo et al., Biotechniques 29(1):128-38 (2000);
Fischer et al., J. Biol. Regul. Homeost. Agents 14(2):83-92
(2000); Fischer et al., Biotechnol. Appl. Biochem. 30 (Pt
2):113-6 (1999); Fischer et al., Biol. Chem. 380(7-8):825-39
(1999); Russell, Curr. Top. Microbiol. Immunol. 240:119-38
(1999); and Ma et al., Plant Physiol. 109(2):341-6 (1995),
the disclosures of which are incorporated herein by
reference in their entireties.

Mammalian cells useful for recombinant expression 
of antibodies, antibody fragments, and antibody derivatives 
of the present invention include CHO cells, COS cells, 293 
cells, and myeloma cells.

Verma et al., J. Immunol. Methods 216(1-2):165-81
(1998), review and compare bacterial, yeast, insect and 
mammalian expression systems for expression of antibodies. 

Antibodies of the present invention can also be
prepared by cell free translation, as further described in
Merk et al., J. Biochem. (Tokyo) 125(2):328-33 (1999) and
Ryabova et al., Nature Biotechnol. 15(1):79-84 (1997), and
in the milk of transgenic animals, as further described in
Pollock et al., J. Immunol. Methods 231(1-2):147-57 (1999),
the disclosures of which are incorporated herein by
reference in their entireties.

The invention further provides antibody fragments
that bind specifically to one or more of the proteins and
protein fragments of the present invention, to one or more
of the proteins and protein fragments encoded by the
isolated nucleic acids of the present invention, or the
binding of which can be competitively inhibited by one or
more of the proteins and protein fragments of the present
invention or one or more of the proteins and protein
fragments encoded by the isolated nucleic acids of the
present invention.

Among such useful fragments are Fab, Fab', Fv,
F(ab)'2, and single chain Fv (scFv) fragments. Other useful
fragments are described in Hudson, Curr. Opin. Biotechnol.
9(4):395-402 (1998).

It is also an aspect of the present invention to
provide antibody derivatives that bind specifically to one
or more of the proteins and protein fragments of the present
invention, to one or more of the proteins and protein
fragments encoded by the isolated nucleic acids of the
present invention, or the binding of which can be
competitively inhibited by one or more of the proteins and
protein fragments of the present invention or one or more of
the proteins and protein fragments encoded by the isolated
nucleic acids of the present invention.

Among such useful derivatives are chimeric,
primatized, and humanized antibodies; such derivatives are
less immunogenic in human beings, and thus more suitable for
in vivo administration, than are unmodified antibodies from
non-human mammalian species.

Chimeric antibodies typically include heavy and/or
light chain variable regions (including both CDR and
framework residues) of immunoglobulins of one species,
typically mouse, fused to constant regions of another
species, typically human. See, e.g., U.S. Pat. No.
5,807,715; Morrison et al., Proc. Natl. Acad. Sci.
USA 81(21):6851-5 (1984); Sharon et al., Nature
309(5966):364-7 (1984); Takeda et al., Nature
314(6010):452-4 (1985), the disclosures of which are
incorporated herein by reference in their entireties.

Primatized and humanized antibodies typically include heavy
and/or light chain CDRs from a murine antibody grafted into
a non-human primate or human antibody V region framework,
usually further comprising a human constant region,
Riechmann et al., Nature 332(6162):323-7 (1988); Co et al.,
Nature 351(6326):501-2 (1991); U.S. Pat. Nos. 6,054,297;
5,821,337; 5,770,196; 5,766,886; 5,821,123; 5,869,619;
6,180,377; 6,013,256; 5,693,761; and 6,180,370, the
disclosures of which are incorporated herein by reference in
their entireties.

Other useful antibody derivatives of the invention
include heteromeric antibody complexes and antibody fusions,
such as diabodies (bispecific antibodies), single-chain
diabodies, and intrabodies.

The antibodies of the present invention, including
fragments and derivatives thereof, can usefully be labeled.
It is, therefore, another aspect of the present invention
to provide labeled antibodies that bind specifically to one
or more of the proteins and protein fragments of the present
invention, to one or more of the proteins and protein
fragments encoded by the isolated nucleic acids of the
present invention, or the binding of which can be
competitively inhibited by one or more of the proteins and
protein fragments of the present invention or one or more of
the proteins and protein fragments encoded by the isolated
nucleic acids of the present invention.

The choice of label depends, in part, upon the
desired use.

For example, when the antibodies of the present
invention are used for immunohistochemical staining of
tissue samples, the label can usefully be an enzyme that
catalyzes production and local deposition of a detectable
product.

Enzymes typically conjugated to antibodies to
permit their immunohistochemical visualization are well
known, and include alkaline phosphatase, β-galactosidase,
glucose oxidase, horseradish peroxidase (HRP), and urease.
Typical substrates for production and deposition of visually
detectable products include
o-nitrophenyl-beta-D-galactopyranoside (ONPG);
o-phenylenediamine dihydrochloride (OPD); p-nitrophenyl
phosphate (PNPP); p-nitrophenyl-beta-D-galactopyranoside
(PNPG); 3,3'-diaminobenzidine (DAB); 3-amino-9-ethylcarbazole
(AEC); 4-chloro-1-naphthol (CN);
5-bromo-4-chloro-3-indolyl-phosphate (BCIP); ABTS®; BluoGal;
iodonitrotetrazolium (INT); nitroblue tetrazolium chloride
(NBT); phenazine methosulfate (PMS); phenolphthalein
monophosphate (PMP); tetramethyl benzidine (TMB);
tetranitroblue tetrazolium (TNBT); X-Gal; X-Gluc; and
X-Glucoside.

Other substrates can be used to produce products
for local deposition that are luminescent. For example, in
the presence of hydrogen peroxide (H2O2), horseradish
peroxidase (HRP) can catalyze the oxidation of cyclic
diacylhydrazides, such as luminol. Immediately following
the oxidation, the luminol is in an excited state
(intermediate reaction product), which decays to the ground
state by emitting light. Strong enhancement of the light
emission is produced by enhancers, such as phenolic
compounds. Advantages include high sensitivity, high
resolution, and rapid detection without radioactivity and
requiring only small amounts of antibody. See, e.g., Thorpe
et al., Methods Enzymol. 133:331-53 (1986); Kricka et al.,
J. Immunoassay 17(1):67-83 (1996); and Lundqvist et al., J.
Biolumin. Chemilumin. 10(6):353-9 (1995), the disclosures of
which are incorporated herein by reference in their
entireties. Kits for such enhanced chemiluminescent
detection (ECL) are available commercially.

The antibodies can also be labeled using colloidal
gold.

As another example, when the antibodies of the
present invention are used, e.g., for flow cytometric
detection, for scanning laser cytometric detection, or for
fluorescent immunoassay, they can usefully be labeled with
fluorophores.

There are a wide variety of fluorophore labels
that can usefully be attached to the antibodies of the
present invention.

For flow cytometric applications, both for
extracellular detection and for intracellular detection,
common useful fluorophores can be fluorescein isothiocyanate
(FITC), allophycocyanin (APC), R-phycoerythrin (PE),
peridinin chlorophyll protein (PerCP), Texas Red, Cy3, Cy5,
fluorescence resonance energy tandem fluorophores such as
PerCP-Cy5.5, PE-Cy5, PE-Cy5.5, PE-Cy7, PE-Texas Red, and
APC-Cy7.

Other fluorophores include, inter alia, Alexa
Fluor® 350, Alexa Fluor® 488, Alexa Fluor® 532, Alexa Fluor®
546, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 647
(monoclonal antibody labeling kits available from Molecular
Probes, Inc., Eugene, OR, USA), BODIPY dyes, such as BODIPY
493/503, BODIPY FL, BODIPY R6G, BODIPY 530/550, BODIPY TMR,
BODIPY 558/568, BODIPY 564/570, BODIPY
576/589, BODIPY 581/591, BODIPY TR, BODIPY 630/650, BODIPY
650/665, Cascade Blue, Cascade Yellow, Dansyl, lissamine
rhodamine B, Marina Blue, Oregon Green 488, Oregon Green
514, Pacific Blue, rhodamine 6G, rhodamine green, rhodamine
red, tetramethylrhodamine, Texas Red (available from
Molecular Probes, Inc., Eugene, OR, USA), and Cy2, Cy3,
Cy3.5, Cy5, Cy5.5, Cy7, all of which are also useful for
fluorescently labeling the antibodies of the present
invention.

For secondary detection using labeled avidin,
streptavidin, captavidin or neutravidin, the antibodies of
the present invention can usefully be labeled with biotin.

When the antibodies of the present invention are
used, e.g., for western blotting applications, they can
usefully be labeled with radioisotopes, such as 33P, 32P, 35S,
3H, and 125I.

As another example, when the antibodies of the
present invention are used for radioimmunotherapy, the label
can usefully be 228Th, 227Ac, 225Ac, 223Ra, 213Bi, 212Pb, 212Bi,
211At, 203Pb, 194Os, 188Re, 186Re, 153Sm, 149Tb, 131I, 125I, 111In,
105Rh, 99mTc, 97Ru, 90Y, 90Sr, 88Y, 72Se, 67Cu, or 47Sc.

As another example, when the antibodies of the
present invention are to be used for in vivo diagnostic use,
they can be rendered detectable by conjugation to MRI
contrast agents, such as gadolinium
diethylenetriaminepentaacetic acid (DTPA), Lauffer et al.,
Radiology 207(2):529-38 (1998), or by radioisotopic labeling.

As would be understood, use of the labels
described above is not restricted to the application for
which they were mentioned.

The antibodies of the present invention, including
fragments and derivatives thereof, can also be conjugated to
toxins, in order to target the toxin's ablative action to
cells that display and/or express the proteins of the
present invention. Commonly, the antibody in such
immunotoxins is conjugated to Pseudomonas exotoxin A,
diphtheria toxin, shiga toxin A, anthrax toxin lethal
factor, or ricin. See Hall (ed.), Immunotoxin Methods and
Protocols (Methods in Molecular Biology, Vol 166), Humana
Press (2000) (ISBN: 0896037754); and Frankel et al. (eds.),
Clinical Applications of Immunotoxins, Springer-Verlag New
York, Incorporated (1998) (ISBN: 3540640975), the disclosures
of which are incorporated herein by reference in their
entireties, for review.

The antibodies of the present invention can
usefully be attached to a substrate, and it is, therefore,
another aspect of the invention to provide antibodies that
bind specifically to one or more of the proteins and protein
fragments of the present invention, to one or more of the
proteins and protein fragments encoded by the isolated
nucleic acids of the present invention, or the binding of
which can be competitively inhibited by one or more of the
proteins and protein fragments of the present invention or
one or more of the proteins and protein fragments encoded by
the isolated nucleic acids of the present invention,
attached to a substrate.

Substrates can be porous or nonporous, planar or
nonplanar.

For example, the antibodies of the present
invention can usefully be conjugated to filtration media,
such as NHS-activated Sepharose or CNBr-activated Sepharose
for purposes of immunoaffinity chromatography.

For example, the antibodies of the present
invention can usefully be attached to paramagnetic
microspheres, typically by biotin-streptavidin interaction,
which microspheres can then be used for isolation of cells
that express or display the proteins of the present
invention. As another example, the antibodies of the
present invention can usefully be attached to the surface of
a microtiter plate for ELISA.

As noted above, the antibodies of the present
invention can be produced in prokaryotic and eukaryotic
cells. It is, therefore, another aspect of the present
invention to provide cells that express the antibodies of
the present invention, including hybridoma cells, B cells,
plasma cells, and host cells recombinantly modified to
express the antibodies of the present invention.

In yet a further aspect, the present invention
provides aptamers evolved to bind specifically to one or
more of the proteins and protein fragments of the present
invention, to one or more of the proteins and protein
fragments encoded by the isolated nucleic acids of the
present invention, or the binding of which can be
competitively inhibited by one or more of the proteins and
protein fragments of the present invention or one or more of
the proteins and protein fragments encoded by the isolated
nucleic acids of the present invention.

MDZ3 Antibodies 

In a first series of antibody embodiments, the
invention provides antibodies, both polyclonal and
monoclonal, and fragments and derivatives thereof, that bind
specifically to, or the binding of which can be
competitively inhibited by, a polypeptide having an amino
acid sequence encoded by the MDZ3 cDNA in ATCC Deposit No.
______, or having the amino acid sequence of SEQ ID
NO: 3.

In a second series of antibody embodiments, the
invention provides antibodies, both polyclonal and
monoclonal, and fragments and derivatives thereof, that bind
specifically to, or the binding of which can be
competitively inhibited by, a polypeptide having the amino
acid sequence of SEQ ID NO: 7.

In a third series of antibody embodiments, the
invention provides antibodies, both polyclonal and
monoclonal, and fragments and derivatives thereof, that bind
specifically to, or the binding of which can be
competitively inhibited by, polypeptides encoded by any of
the MDZ3 nucleic acids of the present invention, as
above-described.

Such antibodies are useful in in vitro
immunoassays, such as ELISA, western blot or
immunohistochemical assay, for detection of MDZ3 and related
proteins. Such antibodies are also useful in isolating and
purifying MDZ3 proteins, including related cross-reactive
proteins, by immunoprecipitation, immunoaffinity
chromatography, or magnetic bead-mediated purification.

In other embodiments, the invention further
provides the above-described antibodies detectably labeled,
and in yet other embodiments, provides the above-described
antibodies attached to a substrate.

MDZ4 Antibodies 

In a first series of antibody embodiments, the
invention provides antibodies, both polyclonal and
monoclonal, and fragments and derivatives thereof, that bind
specifically to, or the binding of which can be
competitively inhibited by, a polypeptide having an amino
acid sequence encoded by the MDZ4 cDNA in ATCC Deposit No.
______, or having the amino acid sequence of SEQ ID
NO: 3029.

In a second series of antibody embodiments, the
invention provides antibodies, both polyclonal and
monoclonal, and fragments and derivatives thereof, that bind
specifically to, or the binding of which can be
competitively inhibited by, a polypeptide having the amino
acid sequence of SEQ ID NO: 3033.

In a third series of antibody embodiments, the
invention provides antibodies, both polyclonal and
monoclonal, and fragments and derivatives thereof, that bind
specifically to, or the binding of which can be
competitively inhibited by, a polypeptide having the amino
acid sequence of SEQ ID NO: 3037.

In a fourth series of antibody embodiments, the
invention provides antibodies, both polyclonal and
monoclonal, and fragments and derivatives thereof, that bind
specifically to, or the binding of which can be
competitively inhibited by, polypeptides encoded by any of
the MDZ4 nucleic acids of the present invention, as
above-described.

Such antibodies are useful in in vitro
immunoassays, such as ELISA, western blot or
immunohistochemical assay, for detection of MDZ4 and related
proteins. Such antibodies are also useful in isolating and
purifying MDZ4 proteins, including related cross-reactive
proteins, by immunoprecipitation, immunoaffinity
chromatography, or magnetic bead-mediated purification.

In other embodiments, the invention further
provides the above-described antibodies detectably labeled,
and in yet other embodiments, provides the above-described
antibodies attached to a substrate.

MDZ7 Antibodies

In a first series of antibody embodiments of this
aspect of the invention, the invention provides antibodies,
both polyclonal and monoclonal, and fragments and
derivatives thereof, that bind specifically to, or the
binding of which can be competitively inhibited by, a
polypeptide having an amino acid sequence encoded by the
MDZ7 cDNA in ATCC Deposit No. ______, or having the
amino acid sequence of SEQ ID NO: 4409.

Such antibodies are useful in in vitro
immunoassays, such as ELISA, western blot or
immunohistochemical assay, for detection of MDZ7 and related
proteins. Such antibodies are also useful in isolating and
purifying MDZ7 proteins, including related cross-reactive
proteins, by immunoprecipitation, immunoaffinity
chromatography, or magnetic bead-mediated purification.

In other embodiments, the invention further
provides the above-described antibodies detectably labeled,
and in yet other embodiments, provides the above-described
antibodies attached to a substrate.

MDZ12a, MDZ12bS and MDZ12bL Antibodies 

The invention further provides antibodies,
polyclonal or monoclonal, and fragments and derivatives
thereof, that bind specifically to, or the binding of which
can be competitively inhibited by, a polypeptide having an
amino acid sequence encoded by the MDZ12a cDNA in ATCC
Deposit No. ______, or having the amino acid
sequence of SEQ ID NO: 5772. Depending upon the epitope
recognized, certain of such antibodies can cross-react with
MDZ12bS, others with MDZ12bL, and yet others with neither
MDZ12bS nor MDZ12bL. These subsets can be discriminated by
screening the antibodies for ability to bind to the MDZ12bS
and MDZ12bL polypeptides of the present invention, the
sequences of which are set forth respectively in SEQ ID NOS:
6939 and 6940.
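
By way of illustration only, the screening step just described can be sketched as a simple bookkeeping routine. The sketch assumes that each anti-MDZ12a antibody has already been assayed against the MDZ12bS and MDZ12bL polypeptides and that a binding signal at or above a cutoff counts as binding; the record type, the field names, the arbitrary signal units and the cutoff value are hypothetical and are not taken from the present disclosure. An antibody that binds both polypeptides is reported separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BindingSignal:
    """Hypothetical assay readout for one antibody (arbitrary units)."""
    antibody_id: str
    mdz12bs_signal: float  # signal against the MDZ12bS polypeptide
    mdz12bl_signal: float  # signal against the MDZ12bL polypeptide


def classify_cross_reactivity(reading, cutoff=0.5):
    """Assign an antibody to a subset; the cutoff is an assumed threshold."""
    binds_bs = reading.mdz12bs_signal >= cutoff
    binds_bl = reading.mdz12bl_signal >= cutoff
    if binds_bs and binds_bl:
        return "both"
    if binds_bs:
        return "MDZ12bS"
    if binds_bl:
        return "MDZ12bL"
    return "neither"


def group_by_subset(readings, cutoff=0.5):
    """Group antibody identifiers by the subset each one falls into."""
    groups = {"MDZ12bS": [], "MDZ12bL": [], "both": [], "neither": []}
    for reading in readings:
        subset = classify_cross_reactivity(reading, cutoff)
        groups[subset].append(reading.antibody_id)
    return groups
```

In practice the cutoff would be set from the negative controls of whichever immunoassay is used; the fixed default here is only a placeholder.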

In another series of embodiments, the invention
provides antibodies, both polyclonal and monoclonal, and
fragments and derivatives thereof, that bind specifically
to, or the binding of which can be competitively inhibited
by, a polypeptide having the amino acid sequence of SEQ ID
NO: 5774.

In another series of embodiments, the invention
provides antibodies, including fragments and derivatives
thereof, both monoclonal and polyclonal, that bind
specifically to, or the binding of which can be
competitively inhibited by, a MDZ12bS polypeptide encoded by
ATCC Deposit No. ______, or having the amino
acid sequence of SEQ ID NO: 6939. Such antibodies will
cross-react with MDZ12a but not with MDZ12bL.

In yet another series of embodiments, the
invention provides antibodies, including fragments and
derivatives thereof, monoclonal and polyclonal, that bind
specifically to, or the binding of which can be
competitively inhibited by, a MDZ12bL polypeptide encoded by
ATCC Deposit No. ______, or having the amino acid
sequence of SEQ ID NO: 6940. Such antibodies will
cross-react with MDZ12a but not with MDZ12bS.

Such antibodies are useful in in vitro
immunoassays, such as ELISA, western blot or
immunohistochemical assay, for detection of MDZ12 and related
proteins. Such antibodies are also useful in isolating and
purifying MDZ12 proteins, including related cross-reactive
proteins, by immunoprecipitation, immunoaffinity
chromatography, or magnetic bead-mediated purification.

In other embodiments, the invention further
provides the above-described antibodies detectably labeled,
and in yet other embodiments, provides the above-described
antibodies attached to a substrate.

PHARMACEUTICAL COMPOSITIONS 

MDZ3, MDZ4, MDZ7 and MDZ12 are important for
transcriptional regulation and protein-protein interactions
with other transcription modulators; defects in MDZ3, MDZ4,
MDZ7 or MDZ12 expression, activity, distribution,
localization, and/or solubility are a cause of human
disease, which disease can manifest as a disorder of brain,
testis, heart or bone marrow function for MDZ3; bone marrow,
brain, heart, HeLa, adult liver, fetal liver, lung, placenta
or prostate function for MDZ4; testis function for MDZ7; and
brain, heart, kidney, placenta, skeletal muscle, testis,
bone marrow or liver function for MDZ12.

Accordingly, pharmaceutical compositions
comprising nucleic acids, proteins, and antibodies of the
present invention, as well as mimetics, agonists,
antagonists, or inhibitors of MDZ3, MDZ4, MDZ7 or MDZ12
activity, can be administered as therapeutics for treatment
of MDZ3, MDZ4, MDZ7 or MDZ12 defects, respectively.

Thus, in another aspect, the invention provides
pharmaceutical compositions comprising the nucleic acids,
nucleic acid fragments, proteins, protein fusions, protein
fragments, antibodies, antibody derivatives, antibody
fragments, mimetics, agonists, antagonists, and inhibitors
of the present invention.

Such a composition typically contains from about
0.1 to 90% by weight of a therapeutic agent of the invention
formulated in and/or with a pharmaceutically acceptable
carrier or excipient.
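
As a purely illustrative aid, the percent-by-weight figure above can be checked with a short calculation. The function names and the example masses below are hypothetical; only the range of about 0.1 to 90% by weight comes from the text.

```python
def weight_percent(agent_mass_mg, total_mass_mg):
    """Return the therapeutic agent's share of the composition, in % by weight."""
    if total_mass_mg <= 0:
        raise ValueError("total mass must be positive")
    if agent_mass_mg < 0 or agent_mass_mg > total_mass_mg:
        raise ValueError("agent mass must lie between 0 and the total mass")
    return 100.0 * agent_mass_mg / total_mass_mg


def within_typical_range(percent, low=0.1, high=90.0):
    """Check a value against the 'about 0.1 to 90% by weight' range."""
    return low <= percent <= high


# Hypothetical example: 50 mg of agent in a 500 mg tablet.
tablet_percent = weight_percent(50, 500)
```

Because the range is stated as "about", the hard bounds used here are a simplification rather than a limit on the compositions described.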

Pharmaceutical formulation is a well-established
art, and is further described in Gennaro (ed.), Remington:
The Science and Practice of Pharmacy, 20th ed., Lippincott,
Williams & Wilkins (2000) (ISBN: 0683306472); Ansel et al.,
Pharmaceutical Dosage Forms and Drug Delivery Systems, 7th
ed., Lippincott Williams & Wilkins Publishers (1999) (ISBN:
0683305727); and Kibbe (ed.), Handbook of Pharmaceutical
Excipients, American Pharmaceutical Association, 3rd ed.
(2000) (ISBN: 091733096X), the disclosures of which are
incorporated herein by reference in their entireties, and
thus need not be described in detail herein.

Briefly, however, formulation of the
pharmaceutical compositions of the present invention will
depend upon the route chosen for administration. The
pharmaceutical compositions utilized in this invention can
be administered by various routes including both enteral and
parenteral routes, including oral, intravenous,
intramuscular, subcutaneous, inhalation, topical,
sublingual, rectal, intra-arterial, intramedullary,
intrathecal, intraventricular, transmucosal, transdermal,
intranasal, intraperitoneal, intrapulmonary, and
intrauterine.

Oral dosage forms can be formulated as tablets, 
pills, dragees, capsules, liquids, gels, syrups, slurries, 
suspensions, and the like, for ingestion by the patient. 

Solid formulations of the compositions for oral
administration can contain suitable carriers or excipients,
such as carbohydrate or protein fillers, such as sugars,
including lactose, sucrose, mannitol, or sorbitol; starch
from corn, wheat, rice, potato, or other plants; cellulose,
such as methyl cellulose, hydroxypropylmethyl-cellulose,
sodium carboxymethylcellulose, or microcrystalline
cellulose; gums including arabic and tragacanth; proteins
such as gelatin and collagen; inorganics, such as kaolin,
calcium carbonate, dicalcium phosphate, sodium chloride; and
other agents such as acacia and alginic acid.

Agents that facilitate disintegration and/or 
solubilization can be added, such as the cross-linked 
polyvinyl pyrrolidone, agar, alginic acid, or a salt
thereof, such as sodium alginate, microcrystalline 
cellulose, corn starch, sodium starch glycolate, and alginic 
acid. 

Tablet binders that can be used include acacia,
methylcellulose, sodium carboxymethylcellulose,
polyvinylpyrrolidone (Povidone™), hydroxypropyl
methylcellulose, sucrose, starch and ethylcellulose.

Lubricants that can be used include magnesium 
stearates, stearic acid, silicone fluid, talc, waxes, oils, 
and colloidal silica.

Fillers, agents that facilitate disintegration 
and/or solubilization, tablet binders and lubricants, 
including the aforementioned, can be used singly or in 
combination. 

Solid oral dosage forms need not be uniform
throughout.

For example, dragee cores can be used in
conjunction with suitable coatings, such as concentrated
sugar solutions, which can also contain gum arabic, talc,
polyvinylpyrrolidone, carbopol gel, polyethylene glycol,
and/or titanium dioxide, lacquer solutions, and suitable
organic solvents or solvent mixtures.

Oral dosage forms of the present invention include
push-fit capsules made of gelatin, as well as soft, sealed
capsules made of gelatin and a coating, such as glycerol or
sorbitol. Push-fit capsules can contain active ingredients
mixed with a filler or binders, such as lactose or starches,
lubricants, such as talc or magnesium stearate, and,
optionally, stabilizers. In soft capsules, the active
compounds can be dissolved or suspended in suitable liquids,
such as fatty oils, liquid, or liquid polyethylene glycol
with or without stabilizers.

Additionally, dyestuffs or pigments can be added 
to the tablets or dragee coatings for product identification 
or to characterize the quantity of active compound, i.e., 
dosage.

Liquid formulations of the pharmaceutical
compositions for oral (enteral) administration are prepared
in water or other aqueous vehicles and can contain various
suspending agents such as methylcellulose, alginates,
tragacanth, pectin, kelgin, carrageenan, acacia,
polyvinylpyrrolidone, and polyvinyl alcohol. The liquid
formulations can also include solutions, emulsions, syrups
and elixirs containing, together with the active
compound(s), wetting agents, sweeteners, and coloring and
flavoring agents.

The pharmaceutical compositions of the present
invention can also be formulated for parenteral
administration.

For intravenous injection, water soluble versions
of the compounds of the present invention are formulated in,
or if provided as a lyophilate, mixed with, a
physiologically acceptable fluid vehicle, such as 5%
dextrose ("D5"), physiologically buffered saline, 0.9%
saline, Hanks' solution, or Ringer's solution.

Intramuscular preparations, e.g. a sterile
formulation of a suitable soluble salt form of the compounds
of the present invention, can be dissolved and administered
in a pharmaceutical excipient such as Water-for-Injection,
0.9% saline, or 5% glucose solution. Alternatively, a
suitable insoluble form of the compound can be prepared and
administered as a suspension in an aqueous base or a
pharmaceutically acceptable oil base, such as an ester of a
long chain fatty acid (e.g., ethyl oleate), fatty oils such
as sesame oil, triglycerides, or liposomes.

Parenteral formulations of the compositions can
contain various carriers such as vegetable oils,
dimethylacetamide, dimethyl formamide, ethyl lactate, ethyl
carbonate, isopropyl myristate, ethanol, polyols (glycerol,
propylene glycol, liquid polyethylene glycol, and the like).

Aqueous injection suspensions can also contain
substances that increase the viscosity of the suspension,
such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Non-lipid polycationic amino polymers can also be
used for delivery. Optionally, the suspension can also
contain suitable stabilizers or agents that increase the
solubility of the compounds to allow for the preparation of
highly concentrated solutions.

Pharmaceutical compositions of the present
invention can also be formulated to permit injectable,
long-term deposition.

The pharmaceutical compositions of the present 
invention can be administered topically. 

A topical semi-solid ointment formulation
typically contains a concentration of the active ingredient
from about 1 to 20%, e.g., 5 to 10%, in a carrier such as a
pharmaceutical cream base. Various formulations for topical
use include drops, tinctures, lotions, creams, solutions,
and ointments containing the active ingredient and various
supports and vehicles. In other transdermal formulations,
typically in patch-delivered formulations, the
pharmaceutically active compound is formulated with one or
more skin penetrants, such as 2-N-methyl-pyrrolidone (NMP)
or Azone.

Inhalation formulations can also readily be
formulated. For inhalation, various powder and liquid
formulations can be prepared.

The pharmaceutically active compound in the
pharmaceutical compositions of the present invention can be
provided as the salt of a variety of acids, including but
not limited to hydrochloric, sulfuric, acetic, lactic,
tartaric, malic, and succinic acid. Salts tend to be more
soluble in aqueous or other protonic solvents than are the
corresponding free base forms.

After pharmaceutical compositions have been
prepared, they are packaged in an appropriate container and
labeled for treatment of an indicated condition.

The active compound will be present in an amount 
effective to achieve the intended purpose. The 
determination of an effective dose is well within the 
capability of those skilled in the art. 

A "therapeutically effective dose" refers to that
amount of active ingredient (for example MDZ3, MDZ4, MDZ7
or MDZ12 protein, fusion protein, or fragments thereof,
antibodies specific for MDZ3, MDZ4, MDZ7 or MDZ12, agonists,
antagonists or inhibitors of MDZ3, MDZ4, MDZ7 or MDZ12)
which ameliorates the signs or symptoms of the disease or
prevents progression thereof; as would be understood in the
medical arts, cure, although desired, is not required.

The therapeutically effective dose of the
pharmaceutical agents of the present invention can be
estimated initially by in vitro tests, such as cell culture
assays, followed by assay in model animals, usually mice,
rats, rabbits, dogs, or pigs. The animal model can also be
used to determine an initial useful concentration range and
route of administration.

For example, the ED50 (the dose therapeutically
effective in 50% of the population) and LD50 (the dose
lethal to 50% of the population) can be determined in one or
more cell culture or animal model systems. The dose ratio
of toxic to therapeutic effects is the therapeutic index,
which can be expressed as LD50/ED50. Pharmaceutical
compositions that exhibit large therapeutic indices are
particularly useful.
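
The ratio described above can be illustrated with a minimal
sketch. The function name and the example doses are
hypothetical and chosen only for illustration; they are not
values reported in this application.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    Both doses must be expressed in the same units (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses (mg/kg), for illustration only.
example_index = therapeutic_index(ld50=200.0, ed50=10.0)
print(f"therapeutic index = {example_index:.1f}")
```

A larger value indicates a wider margin between the
therapeutically effective dose and the lethal dose.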

The data obtained from cell culture assays and
animal studies is used in formulating an initial dosage
range for human use, and preferably provides a range of
circulating concentrations that includes the ED50 with
little or no toxicity. After administration, or between
successive administrations, the circulating concentration of
active agent varies within this range depending upon
pharmacokinetic factors well known in the art, such as the
dosage form employed, sensitivity of the patient, and the
route of administration.

The exact dosage will be determined by the
practitioner, in light of factors specific to the subject
requiring treatment. Factors that can be taken into account
by the practitioner include the severity of the disease
state, general health of the subject, age, weight, gender of
the subject, diet, time and frequency of administration,
drug combination(s), reaction sensitivities, and
tolerance/response to therapy. Long-acting pharmaceutical
compositions can be administered every 3 to 4 days, every
week, or once every two weeks depending on half-life and
clearance rate of the particular formulation.

Normal dosage amounts may vary from 0.1 to 100,000
micrograms, up to a total dose of about 1 g, depending upon
the route of administration. Where the therapeutic agent is
a protein or antibody of the present invention, the
therapeutic protein or antibody agent typically is
administered at a daily dosage of 0.01 mg to 30 mg/kg of
body weight of the patient (e.g., 1 mg/kg to 5 mg/kg). The
pharmaceutical formulation can be administered in multiple
doses per day, if desired, to achieve the total desired
daily dose.
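
As a minimal sketch of the weight-based arithmetic implied
above, the following computes a total daily dose and the
amount per administration. The function name, the 70 kg body
weight, and the twice-daily schedule are hypothetical
illustrations, not dosing guidance; only the 0.01 to 30 mg/kg
bounds are taken from the text.

```python
MIN_MG_PER_KG = 0.01  # lower bound of the daily range stated in the text
MAX_MG_PER_KG = 30.0  # upper bound of the daily range stated in the text


def daily_dose_mg(body_weight_kg: float, dose_mg_per_kg: float,
                  doses_per_day: int = 1) -> tuple[float, float]:
    """Return (total daily dose in mg, amount per administration in mg)."""
    if body_weight_kg <= 0 or doses_per_day < 1:
        raise ValueError("weight must be positive and doses_per_day >= 1")
    if not MIN_MG_PER_KG <= dose_mg_per_kg <= MAX_MG_PER_KG:
        raise ValueError("dose is outside the 0.01-30 mg/kg daily range")
    total = body_weight_kg * dose_mg_per_kg
    return total, total / doses_per_day


# Hypothetical 70 kg subject at 1 mg/kg/day, split into two administrations.
total_mg, per_dose_mg = daily_dose_mg(70.0, 1.0, doses_per_day=2)
print(total_mg, per_dose_mg)
```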

Guidance as to particular dosages and methods of
delivery is provided in the literature and generally
available to practitioners in the art. Those skilled in the
art will employ different formulations for nucleotides than
for proteins or their inhibitors. Similarly, delivery of
polynucleotides or polypeptides will be specific to
particular cells, conditions, locations, etc.

Conventional methods, known to those of ordinary
skill in the art of medicine, can be used to administer the
pharmaceutical formulation(s) of the present invention to
the patient. The pharmaceutical compositions of the present
invention can be administered alone, or in combination with
other therapeutic agents or interventions.

THERAPEUTIC METHODS 

The present invention further provides methods of
treating subjects having defects in MDZ3, MDZ4, MDZ7 or
MDZ12 (e.g., in expression, activity, distribution,
localization, and/or solubility of MDZ3, MDZ4, MDZ7 or
MDZ12) which can manifest as a disorder of brain, testis,
heart or bone marrow function for MDZ3; bone marrow, brain,
heart, HeLa, adult liver, fetal liver, lung, placenta and
prostate function for MDZ4; testes function for MDZ7; and
brain, heart, kidney, placenta, skeletal muscle, testis, bone
marrow or liver function for MDZ12. As used herein,
"treating" includes all medically-acceptable types of
therapeutic intervention, including palliation and
prophylaxis (prevention) of disease.

In one embodiment of the therapeutic methods of
the present invention, a therapeutically effective amount of
a pharmaceutical composition comprising MDZ3, MDZ4, MDZ7 or
MDZ12 protein, fusion, fragment or derivative thereof is
administered to a subject with a clinically-significant
MDZ3, MDZ4, MDZ7 or MDZ12 defect.

Protein compositions are administered, for
example, to complement a deficiency in native MDZ3, MDZ4,
MDZ7 or MDZ12, respectively. In other embodiments, protein
compositions are administered as a vaccine to elicit a
humoral and/or cellular immune response to MDZ3, MDZ4, MDZ7
or MDZ12, respectively. The immune response can be used to
modulate activity of MDZ3, MDZ4, MDZ7 or MDZ12,
respectively, or, depending on the immunogen, to immunize
against aberrant or aberrantly expressed forms, such as
mutant or inappropriately expressed isoforms. In yet other
embodiments, protein fusions having a toxic moiety are
administered to ablate cells that aberrantly accumulate
MDZ3, MDZ4, MDZ7 or MDZ12, respectively.

In another embodiment of the therapeutic methods
of the present invention, a therapeutically effective amount
of a pharmaceutical composition comprising nucleic acid of
the present invention is administered. The nucleic acid can
be delivered in a vector that drives expression of MDZ3,
MDZ4, MDZ7 or MDZ12 protein, fusion, or fragment thereof, or
without such vector.

Nucleic acid compositions that can drive
expression of MDZ3, MDZ4, MDZ7 or MDZ12 are administered,
for example, to complement a deficiency in native MDZ3,
MDZ4, MDZ7 or MDZ12, or as DNA vaccines. Expression vectors
derived from virus, replication deficient retroviruses,
adenovirus, adeno-associated (AAV) virus, herpes virus, or
vaccinia virus can be used (see, e.g., Cid-Arregui (ed.),
Viral Vectors: Basic Science and Gene Therapy, Eaton
Publishing Co., 2000 (ISBN: 188129935X)), as can plasmids.

Antisense nucleic acid compositions, or vectors
that drive expression of MDZ3, MDZ4, MDZ7 or MDZ12 antisense
nucleic acids, are administered to downregulate
transcription and/or translation of MDZ3, MDZ4, MDZ7 or
MDZ12 in circumstances in which excessive production, or
production of aberrant protein, is the pathophysiologic
basis of disease.

Antisense compositions useful in therapy can have
sequence that is complementary to coding or to noncoding
regions of the MDZ3, MDZ4, MDZ7 or MDZ12 genes,
respectively. For example, oligonucleotides derived from
the transcription initiation site, e.g., between positions
-10 and +10 from the start site, are particularly useful.
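
A minimal sketch of how such an oligonucleotide could be
derived in silico follows. It takes the sense-strand window
from -10 to +10 around a start site and returns its reverse
complement. The function name, the convention that there is
no position 0 (so the window is 20 nucleotides), and the toy
sequence are assumptions for illustration; the sequence is not
drawn from the Sequence Listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_window(sense: str, start_index: int,
                     upstream: int = 10, downstream: int = 10) -> str:
    """Return the antisense oligo (5'->3') for a window around a start site.

    sense       -- sense-strand DNA, written 5'->3'
    start_index -- 0-based index of position +1 (assumed convention)
    """
    lo = start_index - upstream
    hi = start_index + downstream
    if lo < 0 or hi > len(sense):
        raise ValueError("window extends past the supplied sequence")
    window = sense[lo:hi].upper()
    if set(window) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse to write the oligo 5'->3'.
    return window.translate(COMPLEMENT)[::-1]


# Toy sequence: 2 flanking bases, 10 A's (-10..-1), 10 C's (+1..+10), 2 more.
toy = "TT" + "A" * 10 + "C" * 10 + "GG"
print(antisense_window(toy, start_index=12))
```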

Catalytic antisense compositions, such as
ribozymes, that are capable of sequence-specific
hybridization to MDZ3, MDZ4, MDZ7 or MDZ12 transcripts, are
also useful in therapy. See, e.g., Phylactou, Adv. Drug
Deliv. Rev. 44(2-3):97-108 (2000); Phylactou et al., Hum.
Mol. Genet. 7(10):1649-53 (1998); Rossi, Ciba Found. Symp.
209:195-204 (1997); and Sigurdsson et al., Trends
Biotechnol. 13(8):286-9 (1995), the disclosures of which are
incorporated herein by reference in their entireties.






Other nucleic acids useful in the therapeutic
methods of the present invention are those that are capable
of triplex helix formation in or near the MDZ3, MDZ4, MDZ7
or MDZ12 genomic locus, respectively. Such triplexing
oligonucleotides are able to inhibit transcription, Intody
et al., Nucleic Acids Res. 28(21):4283-90 (2000); McGuffie
et al., Cancer Res. 60(14):3790-9 (2000), the disclosures of
which are incorporated herein by reference, and
pharmaceutical compositions comprising such triplex forming
oligos (TFOs) are administered in circumstances in which
excessive production, or production of aberrant protein, is
a pathophysiologic basis of disease.

In another embodiment of the therapeutic methods
of the present invention, a therapeutically effective amount
of a pharmaceutical composition comprising an antibody
(including fragment or derivative thereof) of the present
invention is administered. As is well known, antibody
compositions are administered, for example, to antagonize
activity of MDZ3, MDZ4, MDZ7 or MDZ12, respectively, or to
target therapeutic agents to sites of MDZ3, MDZ4, MDZ7 or
MDZ12 presence and/or accumulation.

In another embodiment of the therapeutic methods
of the present invention, a pharmaceutical composition
comprising a non-antibody antagonist of MDZ3, MDZ4, MDZ7 or
MDZ12 is administered. Antagonists of MDZ3, MDZ4, MDZ7 or
MDZ12 can be produced using methods generally known in the
art. In particular, purified MDZ3, MDZ4, MDZ7 or MDZ12
(MDZ12a, MDZ12bS, or MDZ12bL) can be used to screen
libraries of pharmaceutical agents, often combinatorial
libraries of small molecules, to identify those that
specifically bind and antagonize at least one activity of
MDZ3, MDZ4, MDZ7 or MDZ12, respectively.

In other embodiments a pharmaceutical composition
comprising an agonist of MDZ3, MDZ4, MDZ7 or MDZ12 is
administered. Agonists can be identified using methods
analogous to those used to identify antagonists.

In still other therapeutic methods of the present
invention, pharmaceutical compositions comprising host cells
that express MDZ3, MDZ4, MDZ7 or MDZ12, fusions, or
fragments thereof can be administered. In such cases, the
cells are typically autologous, so as to circumvent
xenogeneic or allotypic rejection, and are administered to
complement defects in MDZ3, MDZ4, MDZ7 or MDZ12 production
or activity.

In other embodiments, pharmaceutical compositions
comprising the MDZ3, MDZ4, MDZ7 or MDZ12 proteins, nucleic
acids, antibodies, antagonists, and agonists of the present
invention can be administered in combination with other
appropriate therapeutic agents. Selection of the
appropriate agents for use in combination therapy can be
made by one of ordinary skill in the art according to
conventional pharmaceutical principles. The combination of
therapeutic agents or approaches can act additively or
synergistically to effect the treatment or prevention of the
various disorders described above, providing greater
therapeutic efficacy and/or permitting use of the
pharmaceutical compositions of the present invention using
lower dosages, reducing the potential for adverse side
effects.

TRANSGENIC ANIMALS AND CELLS 

In another aspect, the invention provides
transgenic cells and non-human organisms comprising MDZ3,
MDZ4, MDZ7 or MDZ12 isoform nucleic acids, and transgenic
cells and non-human organisms with targeted disruption of
the endogenous orthologue of the human MDZ3, MDZ4, MDZ7 or
MDZ12 gene, respectively.

The cells can be embryonic stem cells or somatic
cells. The transgenic non-human organisms can be chimeric,
nonchimeric heterozygotes, and nonchimeric homozygotes.

DIAGNOSTIC METHODS 

The nucleic acids of the present invention can be
used as nucleic acid probes to assess the levels of MDZ3,
MDZ4, MDZ7, MDZ12a or MDZ12b mRNA in disease tissues or
cells, and antibodies of the present invention can be used
to assess the expression levels of MDZ3, MDZ4, MDZ7, MDZ12a
or MDZ12b proteins in disease tissues or cells to diagnose a
variety of diseases, including developmental disorders and
cancer.

The following examples are offered for purpose of
illustration, not limitation.

EXAMPLE 1 

Identification and Characterization of 
cDNAs Encoding MDZ3 Proteins 

Bioinformatic algorithms were applied to human
genomic sequence data to identify putative exons. Using a
graphical display particularly designed to facilitate
computerized query of the resulting exon, eight exons were
identified as belonging to the same gene.

Marathon-Ready™ HeLa cell cDNA (Clontech
Laboratories, Palo Alto, CA) was used as a substrate for
standard RACE (rapid amplification of cDNA ends).
Marathon-Ready cDNAs are adaptor-ligated double stranded
cDNAs suitable for 3' and 5' RACE. Chenchik et al.,
BioTechniques 21:526-532 (1996); Chenchik et al.,
CLONTECHniques X(1):5-8 (January 1995). RACE techniques are
described, inter alia, in the Marathon-Ready™ cDNA User
Manual (Clontech Labs., Palo Alto, CA, USA, March 30, 2000,
Part No. PT1156-1 (PR03517)), Ausubel et al. (eds.), Short
Protocols in Molecular Biology: A Compendium of Methods from
Current Protocols in Molecular Biology, 4th edition (April
1999), John Wiley & Sons (ISBN: 047132938X) and Sambrook et
al. (eds.), Molecular Cloning: A Laboratory Manual (3rd
ed.), Cold Spring Harbor Laboratory Press (2000) (ISBN:
0879695773), the disclosures of which are incorporated
herein by reference in their entireties.

Two overlapping RACE products were cloned that
together span 2.0 kilobases and that together appear to
contain the entire coding region of the gene to which the
exons contribute; for reasons described below, we termed
this cDNA MDZ3.

The MDZ3 cDNA was sequenced on both strands using
a MegaBace™ sequencer (Molecular Dynamics, Inc., Sunnyvale,
CA, USA). Sequencing both strands provided us with the
exact chemical structure of the cDNA, which is shown in FIG.
3 and further presented in the SEQUENCE LISTING as SEQ ID
NO: 1, and placed us in actual physical possession of the
entire set of single-base incremented fragments of the
sequenced clone, starting at the 5' and 3' termini.

MDZ3 cDNA was sent to the American Type Culture
Collection (ATCC) for deposit on August 1, 2001, received by
ATCC on August 2, 2001, and accorded accession number

As shown in FIG. 3, the MDZ3 cDNA spans 1981
nucleotides and contains an open reading frame from
nucleotide 311 through and including nt 1945 (inclusive of
termination codon), predicting a protein of 544 amino acids
with a (posttranslationally unmodified) molecular weight of
61.4 kD. The open reading frame appears full length with
in-frame 5' stop codons, a methionine start codon and a stop
codon.
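
The relationship between the stated open reading frame
coordinates and the predicted protein length can be checked
with a short sketch. The function name is hypothetical; the
coordinates are those given in the text, with the termination
codon counted inside the reading frame as stated.

```python
def orf_protein_length(first_nt: int, last_nt: int,
                       includes_stop: bool = True) -> int:
    """Return the number of encoded amino acids for an inclusive ORF range."""
    length = last_nt - first_nt + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of three")
    codons = length // 3
    # The termination codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


# MDZ3: nucleotides 311 through 1945, inclusive of the termination codon.
print(orf_protein_length(311, 1945))
```

With these coordinates the reading frame is 1635 nucleotides,
or 545 codons, one of which is the termination codon, in
agreement with the 544 amino acids stated above.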

BLAST query of genomic sequence identified one
BAC, spanning 14 kb, that constitutes the minimum set of
clones encompassing the cDNA sequence. Based upon the known
origin of the BAC (GenBank accession number AC005020.5), the
MDZ3 gene can be mapped to human chromosome 7q22.1.

Comparison of the cDNA and genomic sequences
identified 8 exons. Exon organization is listed in Table 1.

Table 1
Exon Structure

Exon no.   cDNA range   genomic range   BAC accession
1          1-52         78844-78793     AC005020.5
2          53-179       77284-77158
3          180-264      76761-76677
4          265-697      76248-75816
5          698-899      74436-74235
6          900-991      73260-73169
7          992-1115     71752-71629
8          1116-1981    66618-65753
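
The internal consistency of Table 1 can be checked with a
minimal sketch: each exon should have the same length in cDNA
and genomic coordinates, and the exon lengths should sum to
the 1981 nucleotides of the cDNA. The variable and function
names are hypothetical. The intron calculation assumes that
the descending genomic coordinates reflect a gene on the
reverse strand of the BAC sequence.

```python
# (exon number, cDNA start, cDNA end, genomic start, genomic end), from Table 1.
TABLE_1 = [
    (1, 1, 52, 78844, 78793),
    (2, 53, 179, 77284, 77158),
    (3, 180, 264, 76761, 76677),
    (4, 265, 697, 76248, 75816),
    (5, 698, 899, 74436, 74235),
    (6, 900, 991, 73260, 73169),
    (7, 992, 1115, 71752, 71629),
    (8, 1116, 1981, 66618, 65753),
]


def span(start: int, end: int) -> int:
    """Length of an inclusive coordinate range, in either orientation."""
    return abs(end - start) + 1


def length_mismatches(exons):
    """Exon numbers whose cDNA and genomic lengths disagree."""
    return [n for n, c0, c1, g0, g1 in exons if span(c0, c1) != span(g0, g1)]


def intron_lengths(exons):
    """Genomic gaps between consecutive exons (descending coordinates)."""
    return [prev[4] - nxt[3] - 1 for prev, nxt in zip(exons, exons[1:])]


total_cdna = sum(span(c0, c1) for _, c0, c1, _, _ in TABLE_1)
print(length_mismatches(TABLE_1), total_cdna, intron_lengths(TABLE_1))
```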





FIG. 2 schematizes the exon organization of the 
MDZ3 clone. 






At the top is shown the bacterial artificial
chromosome (BAC), with GenBank accession number, that spans
the MDZ3 locus.

As shown in FIG. 2, MDZ3, encoding a protein of
544 amino acids, is comprised of exons 1-8. Predicted
molecular weight of the protein, prior to any
post-translational modification, is 61.4 kD.

As further discussed in the examples herein,
expression of MDZ3 was assessed using RT-PCR. RT-PCR product
for MDZ3 was clearly produced from brain, testis, heart and
bone marrow, but not from lung, liver, or skeletal muscle.

The sequence of the MDZ3 cDNA was used as a BLAST
query into the GenBank nr and dbEst databases. The nr
database includes all non-redundant GenBank coding sequence
translations, sequences derived from the 3-dimensional
structures in the Brookhaven Protein Data Bank (PDB),
sequences from SwissProt, sequences from the protein
information resource (PIR), and sequences from protein
research foundation (PRF). The dbEst (database of expressed
sequence tags) includes ESTs, short, single pass read cDNA
(mRNA) sequences, and cDNA sequences from differential
display experiments and RACE experiments. BLAST search
identified two human ESTs (BE540085, AW892105), multiple
mouse and rat ESTs, one EST from bovine (BE750886) and one
from pig (BF079982) as having sequence closely related to
MDZ3.

Globally, human MDZ3 resembles a 604 amino acid
residue hypothetical zinc finger protein, KIAA0426, at 36%
amino acid identity and 47% amino acid similarity over 560
amino acids.
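
For orientation, percent identity over an alignment can be
computed as in the following minimal sketch. The function
name, the choice to count only columns in which neither
sequence has a gap, and the toy alignment are assumptions for
illustration; BLAST's own reporting conventions may differ,
and percent similarity additionally depends on the scoring
matrix used, which is not sketched here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("alignment has no ungapped columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy alignment (not MDZ3 or KIAA0426 sequence): 5 of 6 ungapped columns match.
print(round(percent_identity("ACDE-FG", "ACDQKFG"), 1))
```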

Motif searches using Pfam (http://pfam.wustl.edu),
SMART (http://smart.embl-heidelberg.de), and PROSITE
pattern and profile databases
(http://www.expasy.ch/prosite), identified several known
domains shared with SCAN box containing Kruppel family
zinc-finger proteins.

FIG. 1 shows the domain structure of MDZ3 and the
alignment of SCAN box and KRAB domain in MDZ3 with similar
motifs.

As schematized in FIG. 1, the newly isolated gene
product shares certain protein domains and an overall
structural organization with SCAN box-containing Kruppel
family zinc-finger proteins. The shared structural features
strongly imply that MDZ3 plays a role similar to that of
other SCAN box-containing Kruppel family zinc-finger
proteins as regulators of gene expression, and participates
in protein-protein interactions with other transcription
modulators. Thus MDZ3 is a clinically useful diagnostic
marker and potential therapeutic agent for a range of
diseases, including developmental disorders and cancer.

Possession of the genomic sequence permitted
search for promoter and other control sequences for the MDZ3
gene.

A putative transcriptional control region,
inclusive of promoter and downstream elements, was defined
as 1 kb around the transcription start site, itself defined
as the first nucleotide of the MDZ3 cDNA clone. The region,
drawn from sequence of BAC AC005020.5, has the sequence
given in SEQ ID NO: 24, which lists 1000 nucleotides before
the transcription start site.

Transcription factor binding sites were identified
using a web based program (http://motif.genome.ad.jp/),
including two binding sites for sex-determining region Y gene
product (SRY, 474-480 bp and 482-488), for myoblast
determining factor (961-970 bp), and for homeo domain factor
Nkx-2.5/Csx, tinman homolog (589-597 bp, with numbering
according to SEQ ID NO: 24), amongst others.
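
The site coordinates above can be re-expressed relative to
the transcription start site with a minimal sketch. It
assumes that SEQ ID NO: 24 consists of exactly the 1000
nucleotides immediately upstream of the start site, so that
position 1000 corresponds to -1; this numbering convention and
the names used are assumptions for illustration, not
statements from the Sequence Listing.

```python
UPSTREAM_LENGTH = 1000  # nucleotides before the start site, per the text

# Binding-site coordinates as listed in the text (SEQ ID NO: 24 numbering).
SITES = {
    "SRY (site 1)": (474, 480),
    "SRY (site 2)": (482, 488),
    "myoblast determining factor": (961, 970),
    "Nkx-2.5/Csx": (589, 597),
}


def to_tss_relative(position: int) -> int:
    """Map a 1-based upstream coordinate to a negative start-site offset.

    Assumes position 1000 is -1 and that there is no position 0.
    """
    if not 1 <= position <= UPSTREAM_LENGTH:
        raise ValueError("position lies outside the upstream region")
    return position - UPSTREAM_LENGTH - 1


relative_sites = {name: (to_tss_relative(start), to_tss_relative(end))
                  for name, (start, end) in SITES.items()}
for name, (start, end) in relative_sites.items():
    print(f"{name}: {start} to {end}")
```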

We have thus identified a newly described human 
gene, MDZ3, which shares certain protein domains and an 
overall structural organization with SCAN box containing 
Kruppel family zinc-finger proteins. The shared structural 
features strongly imply that the MDZ3 protein plays a role 
similar to the other SCAN box containing Kruppel family 
zinc-finger proteins. It likely functions as a regulator of 
gene expression and participates in protein-protein 
interactions with other transcription modulators; thus, the 
MDZ3 proteins and nucleic acids are clinically useful 
diagnostic markers and potential therapeutic agents for a 
range of diseases, including developmental disorders and 
cancer.



EXAMPLE 2 

Identification and Characterization of 
cDNAs Encoding MDZ4 Proteins 



Predicating our gene discovery efforts on use of 
genome-derived single exon probes and hybridization to 
genome-derived single exon microarrays — an approach that we 
have previously demonstrated will readily identify novel 
genes that have proven refractory to mRNA-based 
identification efforts — we identified an exon in raw human 
genomic sequence that is particularly expressed in human 
bone marrow, brain, heart, HeLa, adult liver, fetal liver, 
lung, placenta and prostate. 

Briefly, bioinformatic algorithms were applied to
human genomic sequence data to identify putative exons.
Each of the predicted exons was amplified from genomic DNA,
typically centering the putative coding sequence within a
larger amplicon that included flanking noncoding sequence.
These genome-derived single exon probes were arrayed on a
support and expression of the bioinformatically predicted
exons assessed through a series of simultaneous two-color
hybridizations to the genome-derived single exon
microarrays.

The approach and procedures are further described
in detail in Penn et al., "Mining the Human Genome using
Microarrays of Open Reading Frames," Nature Genetics
26:315-318 (2000); commonly owned and copending U.S. patent
application nos. 09/864,761, filed May 23, 2001, 09/774,203,
filed January 29, 2001, and 09/632,366, filed August 3,
2000, the disclosures of which are incorporated herein by
reference in their entireties.

Using a graphical display particularly designed to
facilitate computerized query of the resulting exon-specific
expression data, as further described in commonly owned and
copending U.S. patent application nos. 09/864,761, filed May
23, 2001, and 09/774,203, filed January 29, 2001, the
disclosures of which are incorporated herein by reference in
their entireties, two exons were identified that are
expressed in all the human tissues tested; subsequent
analysis revealed that the two exons belong to the same
gene.

Table 2 summarizes the microarray expression data
obtained using genome-derived single exon probes
corresponding to exons 2 and 3. Each probe was completely
sequenced on both strands prior to its use on a
genome-derived single exon microarray; sequencing confirmed
the exact chemical structure of each probe. An added benefit
of sequencing is that it placed us in possession of a set of
single base-incremented fragments of the sequenced nucleic
acid, starting from the sequencing primer's 3' OH. (Since
the single exon probes were first obtained by PCR
amplification from genomic DNA, we were of course
additionally in possession of an even larger set of single
base incremented fragments of each of the single exon
probes, each fragment corresponding to an extension product
from one of the two amplification primers.)

Signals and expression ratios are normalized
values measured and calculated as further described in
commonly owned and copending U.S. patent application nos.
09/864,761, filed May 23, 2001, 09/774,203, filed January
29, 2001, and 09/632,366, filed August 3, 2000, the
disclosures of which are incorporated herein by reference in
their entireties.



Table 2
Expression Analysis
Genome-Derived Single Exon Microarray

               Ampl_31808 (exon_2)          Ampl_9581 (exon 3)
Tissue         Signal   Expression ratio    Signal   Expression ratio
ADULT LIVER    n/d      n/d                 1.4      -1.04
BONE MARROW    1.32     -1.69               n/d      n/d
BRAIN          1.43     -1.63               1.75     n/d
FETAL LIVER    2.17     -1.11               1.07     -1.16
HEART          1.78     -1.22               1.22     -1.07
HELA           2.04     -1.12               1.75     1.01
LUNG           0.78     n/d                 n/d      n/d
PLACENTA       1.23     -2.54               1.61     1
PROSTATE       1.55     n/d                 n/d      n/d



As shown in Table 2, significant expression of
exons 2 and 3 was seen only in human bone marrow, brain,
heart, HeLa, adult liver, fetal liver, lung, placenta and
prostate.

Marathon-Ready™ placenta cDNA (Clontech
Laboratories, Palo Alto, CA, USA) was used as a substrate
for standard RACE (rapid amplification of cDNA ends) to
obtain a cDNA clone that spans 1.3 kilobases and appears to
contain the entire coding region of the gene to which the
exon contributes; for reasons described below, we termed
this cDNA MDZ4. Marathon-Ready™ cDNAs are adaptor-ligated
double stranded cDNAs suitable for 3' and 5' RACE. Chenchik
et al., BioTechniques 21:526-532 (1996); Chenchik et al.,
CLONTECHniques X(1):5-8 (January 1995). RACE techniques are
described, inter alia, in the Marathon-Ready™ cDNA User
Manual (Clontech Labs., Palo Alto, CA, USA, March 30, 2000,
Part No. PT1156-1 (PR03517)), Ausubel et al. (eds.), Short
Protocols in Molecular Biology: A Compendium of Methods
from Current Protocols in Molecular Biology, 4th edition
(April 1999), John Wiley & Sons (ISBN: 047132938X) and
Sambrook et al. (eds.), Molecular Cloning: A Laboratory
Manual (3rd ed.), Cold Spring Harbor Laboratory Press (2000)
(ISBN: 0879695773), the disclosures of which are
incorporated herein by reference in their entireties.

The MDZ4 cDNA was sequenced on both strands using
a MegaBace™ sequencer (Molecular Dynamics, Inc., Sunnyvale,
CA, USA). Sequencing both strands provided us with the
exact chemical structure of the cDNA, which is shown in FIG.
6 and further presented in the SEQUENCE LISTING as SEQ ID
NO: 3027, and placed us in actual physical possession of the
entire set of single-base incremented fragments of the
sequenced clone, starting at the 5' and 3' termini.

MDZ4 cDNA was sent for deposit to the American
Type Culture Collection on August 1, 2001, received at ATCC
on August 2, 2001, and accorded accession number

As shown in FIG. 6, the MDZ4 cDNA spans 1329
nucleotides and contains an open reading frame from
nucleotide 142 through and including nt 1311 (inclusive of
termination codon), predicting a protein of 389 amino acids
with a (posttranslationally unmodified) molecular weight of
44.9 kD. The clone appears full length, with the reading
frame opening with a methionine and terminating with a stop
codon.

BLAST query of genomic sequence identified one
PAC, spanning 128 kb, that constitutes the minimum set of
clones encompassing the cDNA sequence. Based upon the known
origin of the PAC (GenBank accession number Z98745.1), the
MDZ4 gene can be mapped to human chromosome 6p21.3-22.2.

Comparison of the cDNA and genomic sequences
identified 4 exons. Exon organization is listed in Table 3.

Table 3
Exon Structure

Exon no.   cDNA range   genomic range   PAC accession
1          1-64         62122-62059     Z98745.1
2          65-549       55003-54519
3          550-697      54267-54120
4          698-1328     53738-53107
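
Table 3 can be cross-checked against the cDNA length and open
reading frame coordinates stated earlier in this example. The
following minimal sketch, using hypothetical names, compares
each exon's cDNA and genomic lengths and locates the exons
containing the first and last nucleotides of the reading
frame. As printed, the two ranges given for exon 4 differ in
length by one nucleotide, and the genomic ranges sum to the
stated 1329 nucleotides; the sketch only reports the
difference and does not determine which printed value is the
one intended.

```python
# (exon number, cDNA start, cDNA end, genomic start, genomic end), from Table 3.
TABLE_3 = [
    (1, 1, 64, 62122, 62059),
    (2, 65, 549, 55003, 54519),
    (3, 550, 697, 54267, 54120),
    (4, 698, 1328, 53738, 53107),
]


def span(start: int, end: int) -> int:
    """Length of an inclusive coordinate range, in either orientation."""
    return abs(end - start) + 1


def exon_containing(position: int, exons):
    """Return the exon number whose printed cDNA range covers the position."""
    for number, c0, c1, _, _ in exons:
        if c0 <= position <= c1:
            return number
    return None


# (exon number, cDNA length, genomic length) wherever the two disagree.
mismatched = [(n, span(c0, c1), span(g0, g1))
              for n, c0, c1, g0, g1 in TABLE_3
              if span(c0, c1) != span(g0, g1)]
genomic_total = sum(span(g0, g1) for _, _, _, g0, g1 in TABLE_3)

print(mismatched, genomic_total)
print(exon_containing(142, TABLE_3), exon_containing(1311, TABLE_3))
```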






FIG. 5 schematizes the exon organization of the
MDZ4 clone.

At the top is shown the P1 artificial chromosome
(PAC), with GenBank accession number, that spans the MDZ4
locus. The genome-derived single-exon probe first used to
demonstrate expression from this locus is shown below the
PAC and is labeled "500". The 500 bp probe includes
sequence drawn from exon two as well as flanking intron two.

As shown in FIG. 5, MDZ4, encoding a protein of
389 amino acids, comprises exons 1-4. Predicted
molecular weight, prior to any post-translational
modification, is 44.9 kD.

As further discussed in the examples herein,
expression of MDZ4 was assessed using hybridization to
genome-derived single exon microarrays. Microarray analysis
of exons 2 and 3 showed expression in all tissues tested,
namely, in human bone marrow, brain, heart, HeLa, adult
liver, fetal liver, lung, placenta and prostate.

The sequence of the MDZ4 cDNA was used as a BLAST
query into the GenBank nr and dbEst databases. The nr
database includes all non-redundant GenBank coding sequence
translations, sequences derived from the 3-dimensional
structures in the Brookhaven Protein Data Bank (PDB),
sequences from SwissProt, sequences from the protein
information resource (PIR), and sequences from protein
research foundation (PRF). The dbEst (database of expressed
sequence tags) includes ESTs, short, single pass read cDNA
(mRNA) sequences, and cDNA sequences from differential
display experiments and RACE experiments. BLAST search
identified one human EST (BF698315) that skips exon 2, and
two ESTs from pig (BF079982, BE233395) as having sequence
closely related to MDZ4.

Globally, human MDZ4 resembles a family of SCAN
box containing Kruppel family zinc-finger proteins,
including ZNF165 protein (GenBank Accession number: P49910;
40% amino acid identity and 55% amino acid similarity over
447 a.a.), ZNF193 protein (GenBank Accession number: O15535;
54% amino acid identity and 68% amino acid similarity over
387 a.a.) and ZNF232 protein (GenBank Accession number:
Q9UNY5; 47% amino acid identity and 59% amino acid
similarity over 387 a.a.).

Motif searches using Pfam (http://pfam.wustl.edu),
SMART (http://smart.embl-heidelberg.de), and PROSITE pattern
and profile databases (http://www.expasy.ch/prosite),
identified several known domains shared with SCAN box
containing Kruppel family zinc-finger proteins.

FIG. 4 shows the domain structure of MDZ4, 
including the overall structure of MDZ4 and the alignment of 
the SCAN box in MDZ4 with similar motifs. 

As schematized in FIG. 4, the newly isolated gene
product shares certain protein domains and an overall
structural organization with SCAN box containing Kruppel
family zinc-finger proteins. The shared structural features
strongly imply that MDZ4 plays a role similar to that of
SCAN box containing Kruppel family zinc-finger proteins as a
potential transcription regulator, and is likely to
participate in protein-protein interactions with other
transcription modulators. Thus, MDZ4 nucleic acids and
proteins are clinically useful diagnostic markers and
potential therapeutic agents for a variety of diseases,
including developmental disorders and cancer.

Possession of the genomic sequence permitted
search for promoter and other control sequences for the MDZ4
gene.

A putative transcriptional control region,
inclusive of promoter and downstream elements, was defined
as 1 kb around the transcription start site, itself defined
as the first nucleotide of the MDZ4 cDNA clone. The region,
drawn from sequence of PAC Z98745.1, has the sequence given
in SEQ ID NO: 3046, which lists 1000 nucleotides before the
transcription start site.

Transcription factor binding sites were identified
using a web based program (http://motif.genome.ad.jp/),
including a binding site for CdxA (771-777 bp) and for cap
signal for transcription initiation (984-991 bp, with
numbering according to SEQ ID NO: 3046), amongst others.
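
Because SEQ ID NO: 3046 lists the 1000 nucleotides before the transcription start site, positions numbered within that sequence can be re-expressed relative to the start site. The sketch below is a hedged illustration: it assumes the listed region ends immediately upstream of the start site, uses the common convention that the start site is +1 and there is no position 0, and uses a hypothetical function name.

```python
def to_tss_relative(pos: int, upstream_len: int = 1000) -> int:
    """Convert a 1-based position within the upstream region to a
    coordinate relative to the transcription start site (TSS = +1).

    Assumes the region ends immediately before the TSS, so its last
    nucleotide is position -1 (there is no position 0).
    """
    if not 1 <= pos <= upstream_len:
        raise ValueError("position lies outside the upstream region")
    return pos - upstream_len - 1


# Site positions as given in the text (numbering per SEQ ID NO: 3046).
sites = {"CdxA": (771, 777), "cap signal": (984, 991)}
for name, (start, end) in sites.items():
    print(name, to_tss_relative(start), to_tss_relative(end))
```

Under these assumptions the CdxA site falls at -230 to -224 and the cap signal at -17 to -10 relative to the first nucleotide of the cDNA clone.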

We have thus identified a newly described human
gene, MDZ4, which shares certain protein domains and an
overall structural organization with SCAN box containing
Kruppel family zinc-finger proteins. The shared structural
features strongly imply that the MDZ4 protein plays a role
similar to SCAN box containing Kruppel family zinc-finger
proteins, as a potential transcription regulator, and is
likely to participate in protein-protein interactions with
other transcription modulators. Thus MDZ4 nucleic acids and
proteins are clinically useful diagnostic markers and
potential therapeutic agents for a variety of diseases,
including developmental disorders and cancer.

EXAMPLE 3 

Identification and Characterization of 
cDNAs Encoding MDZ7 Proteins 

Bioinformatic algorithms were applied to human
genomic sequence data to identify putative exons. Using a
graphical display particularly designed to facilitate
computerized query of the resulting exon, four exons were
identified as belonging to the same gene.

Marathon-Ready™ placenta cDNA (Clontech
Laboratories, Palo Alto, CA, USA) was used as a substrate
for standard RACE (rapid amplification of cDNA ends).
Marathon-Ready cDNAs are adaptor-ligated double stranded
cDNAs suitable for 3' and 5' RACE. Chenchik et al.,
BioTechniques 21:526-532 (1996); Chenchik et al.,
CLONTECHniques X(1):5-8 (January 1995). RACE techniques are
described, inter alia, in the Marathon-Ready™ cDNA User
Manual (Clontech Labs., Palo Alto, CA, USA, March 30, 2000,
Part No. PT1156-1 (PR03517)), Ausubel et al. (eds.), Short
Protocols in Molecular Biology: A Compendium of Methods
from Current Protocols in Molecular Biology, 4th edition
(April 1999), John Wiley & Sons (ISBN: 047132938X) and
Sambrook et al. (eds.), Molecular Cloning: A Laboratory
Manual (3rd ed.), Cold Spring Harbor Laboratory Press (2000)
(ISBN: 0879695773), the disclosures of which are
incorporated herein by reference in their entireties.

Three overlapping RACE products were cloned that
together contain the complete sequence of MDZ7, a cDNA clone
that collectively spans 2.2 kilobases and appears to contain
the entire coding region of the gene to which the exon
contributes; for reasons described below, we termed this
cDNA MDZ7.

The MDZ7 cDNA was sequenced on both strands using
a MegaBace™ sequencer (Molecular Dynamics, Inc., Sunnyvale,
CA, USA). Sequencing both strands provided us with the
exact chemical structure of the cDNA, which is shown in FIG.
9 and further presented in the SEQUENCE LISTING as SEQ ID
NO: 4407, and placed us in actual physical possession of the
entire set of single-base incremented fragments of the
sequenced clone, starting at the 5' and 3' termini.

In order to assess expression in a variety of
tissues, we generated a pair of PCR primers to analyze the
expression pattern of the human MDZ7 gene in standard RT-PCR
experiments (Sambrook et al., Molecular Cloning, 3rd
edition, 2001). RT-PCR product for MDZ7 was produced from
testis. These experiments placed us in possession of a near
complete set of fragments of the template.

MDZ7 cDNA was sent for deposit to the American
Type Culture Collection on August 1, 2001, received at ATCC
August 2, 2001, and accorded accession number .

As shown in FIG. 9, the MDZ7 cDNA spans 2198
nucleotides and contains an open reading frame from
nucleotide 663 through and including nt 1409 (inclusive of
termination codon), predicting a protein of 248 amino acids
with a (posttranslationally unmodified) molecular weight of
28.8 kD. The open reading frame appears full length with an
in-frame 5' stop codon, a methionine start codon and a stop
codon before a 3' poly-A tail.

BLAST query of genomic sequence identified one
BAC, spanning 121 kb, that constitutes the minimum set of
clones encompassing the cDNA sequence. Based upon the known
origin of the BAC (GenBank accession number AC002310.1),
the MDZ7 gene can be mapped to human chromosome 16p11.2.

Comparison of the cDNA and genomic sequences 
identified 4 exons. Exon organization is listed in Table 4.

Table 4

Exon no.   cDNA range   genomic range   BAC accession
1          1-396        1531-1926       AC002310.1
2          397-525      2497-2625
3          526-1458     4304-5236
4          1459-2198    6181-6920

FIG. 8 schematizes the exon organization of the
MDZ7 clone.

At the top is shown the bacterial artificial
chromosome (BAC), with GenBank accession number, that spans
the MDZ7 locus. As shown in FIG. 8, MDZ7 is comprised of
four exons and encodes a protein of 248 amino acids.
Predicted molecular weight of the MDZ7 protein, prior to any
post-translational modification, is 28.8 kD.

As further discussed in the examples herein,
expression of MDZ7 was assessed using RT-PCR. RT-PCR
analysis showed MDZ7 expression only in testis, and not
in brain, lung, liver, kidney, skeletal muscle, heart, whole
fetus, or HeLa cells.

The sequence of the MDZ7 cDNA was used as a BLAST
query into the GenBank nr and dbEst databases. The nr
database includes all non-redundant GenBank coding sequence
translations, sequences derived from the 3-dimensional
structures in the Brookhaven Protein Data Bank (PDB),
sequences from SwissProt, sequences from the protein
information resource (PIR), and sequences from protein
research foundation (PRF). The dbEst (database of expressed
sequence tags) includes ESTs, short, single pass read cDNA
(mRNA) sequences, and cDNA sequences from differential
display experiments and RACE experiments. BLAST search
identified multiple human ESTs as having sequence closely
related to MDZ7.

Motif searches using Pfam (http://pfam.wustl.edu),
SMART (http://smart.embl-heidelberg.de), and PROSITE pattern
and profile databases (http://www.expasy.ch/prosite),
identified several Kruppel family zinc-finger motifs.

FIG. 7 shows the domain structure of MDZ7.

As schematized in FIG. 7, the newly isolated MDZ7
is mainly composed of seven tandemly arrayed Kruppel-type
(C2H2) zinc finger repeats. Such a structure implies that
MDZ7 is likely to function in sequence-specific DNA binding
and impart a regulatory effect on specific gene expression.
Thus MDZ7 nucleic acids and proteins are clinically useful
diagnostic markers and potential therapeutic agents for a
variety of diseases, including developmental disorders and
cancer.

Possession of the genomic sequence permitted
search for promoter and other control sequences for the MDZ7
gene.

A putative transcriptional control region,
inclusive of promoter and downstream elements, was defined
as 1 kb around the transcription start site, itself defined
as the first nucleotide of the MDZ7 cDNA clone. The region,
drawn from sequence of BAC AC002310.1, has the sequence
given in SEQ ID NO: 4420, which lists 1000 nucleotides
before the transcription start site.

Transcription factor binding sites were identified
using a web based program (http://motif.genome.ad.jp/),
including a binding site for cap signal for transcription
initiation (846-853 bp, with numbering according to SEQ ID
NO: 4420), amongst others.

We have thus identified a newly described human
gene, MDZ7, which contains seven Kruppel family zinc-finger
motifs. The structural features strongly imply that the MDZ7
protein plays a role similar to other Kruppel family
zinc-finger proteins, as a potential transcription regulator.
Thus, MDZ7 nucleic acids and proteins are clinically useful
diagnostic markers and potential therapeutic agents for a
variety of diseases, including developmental disorders and
cancer.

EXAMPLE 4

Identification and Characterization of 
cDNAs Encoding MDZ12 Proteins 



Bioinformatic algorithms were applied to human
genomic sequence data to identify putative exons. Using a
graphical display particularly designed to facilitate
computerized query of the resulting exon, four exons were
identified as belonging to the same gene.

To RACE out the full length MDZ12a gene, human
heart Marathon-Ready cDNA (Clontech) was used as the
template, and oligonucleotides OL612 (5'-
TTACTAAATCAAATGGGTGTTTTGATGGCATAAA-3') [SEQ ID NO: 7036] and
OL613 (5'-GCATCAGGTGGCAAAGCTCAATCAGGACA-3') [SEQ ID NO: 7037]
were used to PCR out a 1.2 kb fragment of the open reading
frame (ORF) using protocols according to the manufacturer's
instructions (Clontech). Marathon-Ready™ cDNAs are
adaptor-ligated double stranded cDNAs suitable for 3' and 5' RACE.
Chenchik et al., BioTechniques 21:526-532 (1996); Chenchik
et al., CLONTECHniques X(1):5-8 (January 1995). RACE
techniques are described, inter alia, in the
Marathon-Ready™ cDNA User Manual (Clontech Labs., Palo Alto, CA,
USA, March 30, 2000, Part No. PT1156-1 (PR03517)), Ausubel
et al. (eds.), Short Protocols in Molecular Biology: A
Compendium of Methods from Current Protocols in Molecular
Biology, 4th edition (April 1999), John Wiley & Sons (ISBN:
047132938X) and Sambrook et al. (eds.), Molecular Cloning: A
Laboratory Manual (3rd ed.), Cold Spring Harbor Laboratory
Press (2000) (ISBN: 0879695773), the disclosures of which
are incorporated herein by reference in their entireties.

Using a similar protocol, oligonucleotides OL614
(5'-CACTCAATGCACGTATAGGGCCTCTCGCC-3') [SEQ ID NO: 7038] and
OL615 (5'-TGTCCTGATTGAGCTTTGCCACCTGATGC-3') [SEQ ID NO: 7039]
were used to PCR out the 5' end of the gene, and
oligonucleotides OL311 (5'-GGTCACCTGTACGCTCCTCTCCATGTCTCTTC-3')
[SEQ ID NO: 7040] and OL312
(5'-CTGTTTGGCTTCCGACCTGCTCCTCACC-3') [SEQ ID NO: 7041] were used
to PCR out the 3' end of the gene. The PCR fragments were
sequenced using a MegaBACE™ sequencer. The final contig of
the sequences revealed a 1.78 kb novel gene with a 1.45 kb
ORF.
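
The primer sequences given above for OL613 and OL615 can be compared directly. The following minimal sketch, using only the two sequences as printed in the text and an illustrative helper function, checks whether one primer is the reverse complement of the other, which would mean that they anneal to the same site on opposite strands.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


OL613 = "GCATCAGGTGGCAAAGCTCAATCAGGACA"  # SEQ ID NO: 7037
OL615 = "TGTCCTGATTGAGCTTTGCCACCTGATGC"  # SEQ ID NO: 7039

print(reverse_complement(OL613) == OL615)  # True
```

The two 29-mers are exact reverse complements. One reading, offered here as an interpretation rather than a statement from the text, is that the 1.2 kb ORF fragment (OL612/OL613) and the 5' end product (OL614/OL615) share this 29-nucleotide site, which would allow the fragments to be joined into a single contig.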

To subclone MDZ12a into a cloning vector, the RACE
product generated with oligonucleotides OL612 and OL638
(5'-CCACCATGTGGCTGGGGACTTCAGGGAAGAGTGGGTTAC-3') [SEQ ID NO: 7042]
against human heart Marathon-Ready cDNA (Clontech) was
ligated and T/A cloned into pGEM-T Easy vector (Promega
Corp.). Individual clones were picked and inserts
sequenced. A second MDZ12 transcript, with the insertion of
an extra exon of 66 nucleotides between exons 2 and 3 of
MDZ12a, was identified. This transcript is named MDZ12b. The
insertion of the extra exon introduces an early stop codon,
reducing the initial ORF to 44 amino acids. However, the
MDZ12b transcript does still contain another ORF downstream
with 332 amino acids.

The MDZ12a cDNA was sequenced on both strands
using a MegaBace™ sequencer (Molecular Dynamics, Inc.,
Sunnyvale, CA, USA). Sequencing both strands provided us
with the exact chemical structure of the cDNA, which is
shown in FIG. 12 and further presented in the SEQUENCE
LISTING as SEQ ID NO: 5770, and placed us in actual physical
possession of the entire set of single-base incremented
fragments of the sequenced clone, starting at the 5' and 3'
termini.

MDZ12a and MDZ12b cDNA were deposited together in
a single tube at the American Type Culture Collection; the
deposit was sent to ATCC on August 1, 2001,
received at ATCC August 2, 2001, and accorded accession
number .

As shown in FIG. 12, the MDZ12a cDNA spans 1780
nucleotides and contains an open reading frame from
nucleotide 127 through and including nt 1578 (inclusive of
termination codon), predicting a protein of 483 amino acids
with a (posttranslationally unmodified) molecular weight of
55.1 kD. The open reading frame appears full length with
in-frame 5' stop codons, a methionine start codon and a stop
codon.

As shown in FIG. 13, the MDZ12b cDNA spans at
least 1518 nucleotides and contains two open reading frames.
The shorter ORF, MDZ12bS, is from nucleotide 1 through and
including nt 135 (inclusive of termination codon),
predicting a polypeptide of 44 amino acids. The longer
ORF, MDZ12bL, is from nucleotide 520 through and including
nt 1518 (inclusive of termination codon), predicting a
polypeptide of 332 amino acids with a (posttranslationally
unmodified) molecular weight of 38.2 kD.

BLAST query of genomic sequence identified one
BAC that constitutes the minimum set of clones encompassing
the cDNA sequence. Based upon the known origin of the BAC
(GenBank accession number AC018946.5), the MDZ12 gene can
be mapped to human chromosome 15q26.1.

Comparison of the cDNA and genomic sequences
identified 4 exons for MDZ12a. Exon organization is listed
in Table 5. The additional exon found in MDZ12b is numbered
as 2'.



Table 5
Exon Structure of the MDZ12 gene

Exon no.   cDNA range   genomic range   BAC accession
1          1-107        155292-155186   AC018946.5
2'         1-66         152246-152181
2          108-230      152955-152833
3          231-337      145120-145226
4          338-1756     146307-147730


FIG. 11 schematizes the exon organization of the
MDZ12 clones.

At the top is shown the bacterial artificial
chromosome (BAC), with GenBank accession number, that spans
the MDZ12 locus.

As shown in FIG. 11, MDZ12a encodes a protein of
483 amino acids, comprising exons 1-4. Predicted
molecular weight of the protein, prior to any
post-translational modification, is 55.1 kD. The inclusion of a
novel exon between exons 2 and 3 introduces an in-frame stop
codon in MDZ12b, and thus MDZ12b encodes a short polypeptide
of 44 amino acids (MDZ12bS). The use of an internal
methionine as initiation methionine in MDZ12b could
potentially encode a 332 amino acid protein (MDZ12bL). The
predicted molecular weight of the MDZ12bL protein, prior to
any post-translational modification, is 38.2 kD.

Expression of MDZ12 was assessed using RT-PCR.
The abundance of PCR product indicates that MDZ12a is
expressed in all tissues examined, with highest expression in
brain, heart, skeletal muscle, testis and HeLa cells.
MDZ12b, however, is expressed with lower to much lower
abundance compared with MDZ12a in bone marrow, brain, heart,
kidney, placenta, skeletal muscle, testis and HeLa cells,
with almost no expression in liver.

The sequences of the MDZ12a and MDZ12b cDNA were
used as BLAST queries into the GenBank nr and dbEst
databases. The nr database includes all non-redundant
GenBank coding sequence translations, sequences derived from
the 3-dimensional structures in the Brookhaven Protein Data
Bank (PDB), sequences from SwissProt, sequences from the
protein information resource (PIR), and sequences from
protein research foundation (PRF). The dbEst (database of
expressed sequence tags) includes ESTs, short, single pass
read cDNA (mRNA) sequences, and cDNA sequences from
differential display experiments and RACE experiments. BLAST
search identified multiple human ESTs as having sequence
closely related to MDZ12.

Motif searches using Pfam (http://pfam.wustl.edu),
SMART (http://smart.embl-heidelberg.de), and PROSITE
pattern and profile databases
(http://www.expasy.ch/prosite), identified several known
domains shared with KRAB domain containing Kruppel family
zinc-finger proteins for MDZ12a and Kruppel family
zinc-fingers for MDZ12bL.

FIG. 10A shows the domain structure of MDZ12,
including the overall structure of MDZ12a and MDZ12bL, with
FIG. 10B showing the alignment of KRAB domain in MDZ12a with
similar motifs.

As schematized in FIGS. 10A and 10B, the newly
isolated MDZ12a contains a partial KRAB motif as well as
twelve copies of C2H2 zinc fingers. The MDZ12bL contains
twelve copies of C2H2 zinc fingers. Such features strongly
imply that MDZ12 plays a role as a potential transcription
regulator, and is likely to participate in protein-protein
interactions with other transcription modulators. Thus MDZ12
nucleic acids and proteins are clinically useful diagnostic
markers and potential therapeutic agents for a variety of
diseases, including developmental disorders and cancer.

Possession of the genomic sequence permitted
search for promoter and other control sequences for the
MDZ12 gene.

A putative transcriptional control region,
inclusive of promoter and downstream elements, was defined
as 1 kb around the transcription start site, itself defined
as the first nucleotide of the MDZ12 cDNA clone. The
region, drawn from sequence of BAC AC018946.5, has the
sequence given in SEQ ID NO: 5783, which lists 1000
nucleotides before the transcription start site.

Transcription factor binding sites were identified
using a web based program (http://motif.genome.ad.jp/),
including a binding site for signal transducers and
activators of transcription (STATx, 837-845 bp) and for
GATA-binding factor 1 (GATA-1, 926-935 bp, with numbering
according to SEQ ID NO: 5783), amongst others.

We have thus identified a newly described human
gene, MDZ12, which shares certain protein domains and an
overall structural organization with KRAB box containing
Kruppel family zinc-finger proteins. The shared structural
features strongly imply that the MDZ12a and MDZ12bL proteins
play a role similar to KRAB box containing Kruppel family
zinc-finger proteins, as potential transcription
regulators, and are likely to participate in protein-protein
interactions with other transcription modulators. Thus MDZ12
is a clinically useful diagnostic marker and potential
therapeutic agent for a variety of diseases, including
developmental disorders and cancer.

EXAMPLE 5

RT-PCR analysis of expression of human MDZ3

RT-PCR analysis was used to determine the
expression pattern of the human MDZ3 gene. A forward primer
(5'-GATGGCGGAAGCTCCTCAGC) [SEQ ID NO: 7043] and a reverse
primer (5'-AGTCCTGCGGTTCCACATAC) [SEQ ID NO: 7044] derived
from the middle of the MDZ3 gene were used in standard
RT-PCRs (Sambrook et al., Molecular Cloning (3rd ed.), 2001).
Templates for the PCRs were obtained from brain, liver,
lung, bone marrow, heart, skeletal muscle, and testis, and
reactions were carried out according to the following
schedule: 94°C, 20 seconds; 68°C, 20 seconds; 72°C, 60
seconds, for 35 cycles. PCR products were separated on an
agarose gel and visualized with a Typhoon™ fluorimager and
ImageQuant™ software (Molecular Dynamics, Sunnyvale, CA).
RT-PCR product for MDZ3 was produced from brain, testis,
heart and bone marrow, but not from lung, liver, or skeletal
muscle (FIG. 14).

EXAMPLE 6 

RT-PCR analysis of expression of human MDZ7 

RT-PCR analysis was used to determine the
expression pattern of the human MDZ7 gene. A forward primer
(5'-TCAGATCTGTCGCTCCTTCA) [SEQ ID NO: 7045] and a reverse
primer (5'-GCAGTCTGAGCACGCGTAAG) [SEQ ID NO: 7046] derived
from the open reading frame of MDZ7 were used in standard
RT-PCRs (Sambrook et al., Molecular Cloning, 3rd edition,
2001). Templates for the PCRs were obtained from whole
fetus, liver, lung, kidney, heart, and testis, and reactions
were carried out according to the following schedule: 94°C,
20 seconds; 68°C, 20 seconds; 72°C, 60 seconds, for 35
cycles. PCR products were separated on an agarose gel and
visualized with a Typhoon™ fluorimager and ImageQuant™
software (Molecular Dynamics, Sunnyvale, CA). RT-PCR
product for MDZ7 was only produced from testis (FIG. 15).

EXAMPLE 7
RT-PCR analysis of MDZ12 expression

To explore the potential function of the MDZ12
gene, the expression of the MDZ12 gene in human tissues was
examined by PCR using Marathon-Ready cDNAs. Oligonucleotides
OL612 and OL638 were used to amplify both MDZ12a and MDZ12b
from human cDNAs of bone marrow, brain, heart, kidney,
liver, placenta, skeletal muscle, testis and HeLa cells. The
PCR conditions were according to a touchdown PCR procedure.
The tubes containing the oligonucleotides, cDNA and Taq
polymerase were first incubated at 94°C for 15 seconds
followed by 72°C for 2 minutes, for 5 cycles. The tubes
were then incubated at 94°C for 15 seconds followed by 70°C
for 2 minutes, for 5 cycles. Finally the tubes were
incubated at 94°C for 15 seconds followed by 68°C for 2
minutes, for 25 cycles. The resulting expression
profile is shown in FIG. 16. The abundance of PCR product
indicates that MDZ12a is expressed in all tissues examined,
with highest expression in brain, heart, skeletal muscle,
testis and HeLa cells (FIG. 16). MDZ12b, however, is
expressed with lower to much lower abundance compared with
MDZ12a in bone marrow, brain, heart, kidney, placenta,
skeletal muscle, testis and HeLa cells, with almost no
expression in liver.
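
The touchdown schedule described above can be written down as data, which makes the total cycle count and programmed hold time easy to verify. This is a minimal sketch only: the class and field names are hypothetical, the second step of each cycle is treated as a combined annealing/extension step (an interpretation of the two-step description), and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    denature_c: int       # denaturation temperature, deg C
    denature_s: int       # denaturation time, seconds
    anneal_extend_c: int  # combined annealing/extension temperature, deg C
    anneal_extend_s: int  # combined annealing/extension time, seconds
    cycles: int           # number of repeats of this stage


# Touchdown schedule as described in Example 7.
TOUCHDOWN = [
    Stage(94, 15, 72, 120, 5),
    Stage(94, 15, 70, 120, 5),
    Stage(94, 15, 68, 120, 25),
]

total_cycles = sum(s.cycles for s in TOUCHDOWN)
hold_seconds = sum(
    (s.denature_s + s.anneal_extend_s) * s.cycles for s in TOUCHDOWN
)
print(total_cycles, hold_seconds / 60)  # 35 cycles, 78.75 minutes
```

The three stages add up to 35 cycles, the same total used in the RT-PCR schedules of Examples 5 and 6.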






EXAMPLE 8 
Preparation and Labeling of Useful 
Fragments of MDZ3, MDZ4, MDZ7 and MDZ12 






Useful fragments of MDZ3, MDZ4, MDZ7 and MDZ12 are 
produced by PCR, using standard techniques, or solid phase 
chemical synthesis using an automated nucleic acid 
synthesizer. Each fragment is sequenced, confirming the 
exact chemical structure thereof. 

The exact chemical structure of preferred 
fragments is provided in the attached SEQUENCE LISTING, the 
disclosure of which is incorporated herein by reference in 
its entirety. The following summary identifies the 
fragments whose structures are more fully described in the 
SEQUENCE LISTING: 



SEQ ID NO: 1              (nt, full length MDZ3 cDNA)
SEQ ID NO: 2              (nt, cDNA ORF of MDZ3)
SEQ ID NO: 3              (aa, full length protein of MDZ3)
SEQ ID NO: 4              (nt, nt 201 - 1721 of MDZ3)
SEQ ID NO: 5              (nt, 5' UT portion of SEQ ID NO: 4)
SEQ ID NO: 6              (nt, coding region of SEQ ID NO: 4)
SEQ ID NO: 7              (aa, aa 1 - 470, CDS entirely within SEQ
                          ID NO: 6)
SEQ ID NOs: 8-15          (nt, exon 1-8 from genomic sequence of
                          MDZ3)
SEQ ID NOs: 16-23         (nt, 500 bp genomic amplicon centered
                          about exons 1 - 8 of MDZ3)
SEQ ID NO: 24             (nt, 1000 bp putative promoter of MDZ3)
SEQ ID NOs: 25-1529       (nt, 17-mers scanning nt 201 - 1721 of
                          human MDZ3)
SEQ ID NOs: 1530-3026     (nt, 25-mers scanning nt 201 - 1721 of
                          human MDZ3)
SEQ ID NO: 3027           (nt, full length MDZ4 cDNA)
SEQ ID NO: 3028           (nt, cDNA ORF of MDZ4)
SEQ ID NO: 3029           (aa, full length protein of MDZ4)
SEQ ID NO: 3030           (nt, nt 65 - 549, portion of MDZ4)
SEQ ID NO: 3031           (nt, 5' UT portion of SEQ ID NO: 3030)
SEQ ID NO: 3032           (nt, coding region of SEQ ID NO: 3030)
SEQ ID NO: 3033           (aa, aa 1 - 136, CDS entirely within SEQ
                          ID NO: 3032)
SEQ ID NO: 3034           (nt, nt 1095 - 1329, portion of MDZ4)
SEQ ID NO: 3035           (nt, coding region of SEQ ID NO: 3034)
SEQ ID NO: 3036           (nt, 3' UT portion of SEQ ID NO: 3034)
SEQ ID NO: 3037           (aa, aa 319 - 389, CDS entirely within
                          SEQ ID NO: 3035)
SEQ ID NOs: 3038-3041     (nt, exons 1-4, from genomic sequence,
                          of MDZ4)
SEQ ID NOs: 3042-3045     (nt, 500 bp genomic amplicon centered
                          about exons 1 - 4 of MDZ4)
SEQ ID NO: 3046           (nt, 1000 bp putative promoter of MDZ4)
SEQ ID NOs: 3047-3515     (nt, 17-mers scanning nt 65 - 549 of
                          human MDZ4)
SEQ ID NOs: 3516-3976     (nt, 25-mers scanning nt 65 - 549 of
                          human MDZ4)
SEQ ID NOs: 3977-4195     (nt, 17-mers scanning nt 1095 - 1329 of
                          human MDZ4)
SEQ ID NOs: 4196-4406     (nt, 25-mers scanning nt 1095 - 1329 of
                          human MDZ4)
SEQ ID NO: 4407           (nt, full length MDZ7 cDNA)
SEQ ID NO: 4408           (nt, cDNA ORF of MDZ7)
SEQ ID NO: 4409           (aa, full length protein of MDZ7)
SEQ ID NO: 4410           (nt, nt 1 - 395, portion of MDZ7)
SEQ ID NO: 4411           (nt, nt 1459 - 1778, portion of MDZ7)
SEQ ID NOs: 4412-4415     (nt, exons 1-4, from genomic sequence,
                          of MDZ7)
SEQ ID NOs: 4416-4419     (nt, 500 bp genomic amplicon centered
                          about exon 1 - 4 of MDZ7)
SEQ ID NO: 4420           (nt, 1000 bp putative promoter of MDZ7)
SEQ ID NOs: 4421-4799     (nt, 17-mers scanning nt 1 - 395 of
                          human MDZ7)
SEQ ID NOs: 4800-5170     (nt, 25-mers scanning nt 1 - 395 of
                          human MDZ7)
SEQ ID NOs: 5171-5474     (nt, 17-mers scanning nt 1459 - 1778 of
                          human MDZ7)
SEQ ID NOs: 5475-5769     (nt, 25-mers scanning nt 1459 - 1778 of
                          human MDZ7)
SEQ ID NO: 5770           (nt, full length MDZ12 cDNA)
SEQ ID NO: 5771           (nt, cDNA ORF of MDZ12)
SEQ ID NO: 5772           (aa, full length protein of MDZ12)
SEQ ID NO: 5773           (nt, nt 352 - 948, portion of MDZ12)
SEQ ID NO: 5774           (aa, aa 76 - 274, portion of MDZ12)
SEQ ID NOs: 5775-5778     (nt, exons 1-4, from genomic sequence,
                          of MDZ12)
SEQ ID NOs: 5779-5782     (nt, 500 bp genomic amplicon centered
                          about exon 1 - 4 of MDZ12)
SEQ ID NO: 5783           (nt, 1000 bp putative promoter of MDZ12)
SEQ ID NOs: 5784-6364     (nt, 17-mers scanning nt 352 - 948 of
                          human MDZ12)
SEQ ID NOs: 6365-6937     (nt, 25-mers scanning nt 352 - 948 of
                          human MDZ12)
SEQ ID NO: 6938           (nt, cDNA of MDZ12b)
SEQ ID NO: 6939           (aa, full length protein of MDZ12bS)
SEQ ID NO: 6940           (aa, full length protein of MDZ12bL)
SEQ ID NO: 6941           (nt, novel exon (nt 105 - 170 portion)
                          of MDZ12b)
SEQ ID NO: 6942           (aa, novel exon (before stop codon) of
                          MDZ12b)
SEQ ID NO: 6943           (nt, 500 bp genomic amplicon centered
                          about novel exon of MDZ12b)
SEQ ID NOs: 6944-6993     (nt, 17-mers scanning novel exon of
                          human MDZ12b)
SEQ ID NOs: 6994-7035     (nt, 25-mers scanning novel exon of
                          human MDZ12b)
SEQ ID NO: 7036           (nt, oligo OL612)
SEQ ID NO: 7037           (nt, oligo OL613)
SEQ ID NO: 7038           (nt, oligo OL614)
SEQ ID NO: 7039           (nt, oligo OL615)
SEQ ID NO: 7040           (nt, oligo OL311)
SEQ ID NO: 7041           (nt, oligo OL312)
SEQ ID NO: 7042           (nt, oligo OL638)
SEQ ID NO: 7043           (nt, forward primer for MDZ3 RT-PCR)
SEQ ID NO: 7044           (nt, reverse primer for MDZ3 RT-PCR)
SEQ ID NO: 7045           (nt, forward primer for MDZ7 RT-PCR)
SEQ ID NO: 7046           (nt, reverse primer for MDZ7 RT-PCR)

Upon confirmation of the exact structure, each of
the above-described nucleic acids of confirmed structure is
recognized to be immediately useful as a MDZ3, MDZ4, MDZ7 or
MDZ12-specific probe.
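
The 17-mer and 25-mer sets listed above "scan" a region one nucleotide at a time, so the number of oligomers follows directly from the region length. The sketch below illustrates this under the assumption of 1-based inclusive coordinates and a step of one nucleotide; the function names and the toy sequence are illustrative and are not taken from the SEQUENCE LISTING.

```python
def scanning_kmers(seq: str, k: int):
    """Yield every k-mer of `seq`, advancing one nucleotide at a time."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def kmer_count(first_nt: int, last_nt: int, k: int) -> int:
    """Number of k-mers scanning an inclusive 1-based nucleotide range."""
    return (last_nt - first_nt + 1) - k + 1


# Region nt 201 - 1721 of MDZ3, as given in the listing.
print(kmer_count(201, 1721, 17))  # 1505, the size of SEQ ID NOs: 25-1529
print(kmer_count(201, 1721, 25))  # 1497, the size of SEQ ID NOs: 1530-3026

# Toy sequence (not from the SEQUENCE LISTING) to show the windows.
print(list(scanning_kmers("ACGTAC", 4)))  # ['ACGT', 'CGTA', 'GTAC']
```

For the MDZ3 region the computed counts match the number of SEQ ID NOs assigned to each set, which supports the one-nucleotide step assumed here.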

For use as labeled nucleic acid probes, the above- 
described MDZ3, MDZ4, MDZ7 or MDZ12 nucleic acids are 
separately labeled by random priming. As is well known in 
the art of molecular biology, random priming places the 
investigator in possession of a near-complete set of labeled 
fragments of the template of varying length and varying 
starting nucleotide. 

The labeled probes are used to identify the MDZ3,
MDZ4, MDZ7 or MDZ12 gene on a Southern blot, and are used to
measure expression of MDZ3, MDZ4, MDZ7 or MDZ12 mRNA on a
northern blot and by RT-PCR, using standard techniques.

EXAMPLE 9 

Production of MDZ3, MDZ4, MDZ7 or MDZ12 Protein 

The full length MDZ3, MDZ4, MDZ7, MDZ12a or MDZ12b
cDNA clone is cloned into the mammalian expression vector
pcDNA3.1/HisA (Invitrogen, Carlsbad, CA, USA), transfected
into COS7 cells, transfectants selected with G418, and
protein expression in transfectants confirmed by detection
of the anti-Xpress™ epitope according to manufacturer's
instructions. Protein is purified using immobilized metal
affinity chromatography and vector-encoded protein sequence
is then removed with enterokinase, per manufacturer's
instructions, followed by gel filtration and/or HPLC.

Following epitope tag removal, MDZ3, MDZ4, MDZ7,
MDZ12a, MDZ12bS or MDZ12bL protein is present at a
concentration of at least 70%, measured on a weight basis
with respect to total protein (i.e., w/w), and is free of
acrylamide monomers, bis acrylamide monomers, polyacrylamide
and ampholytes. Further HPLC purification provides MDZ3,
MDZ4, MDZ7, MDZ12a, MDZ12bS or MDZ12bL protein at a
concentration of at least 95%, measured on a weight basis
with respect to total protein (i.e., w/w).
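
The purity figures above are weight fractions of the target protein relative to total protein. The arithmetic can be sketched as follows; the function name and the masses are hypothetical values chosen only to illustrate the 70% and 95% thresholds, not measurements.

```python
def purity_percent(target_mass_mg: float, total_protein_mass_mg: float) -> float:
    """Weight-basis (w/w) purity of the target protein, as a percentage."""
    if total_protein_mass_mg <= 0:
        raise ValueError("total protein mass must be positive")
    if not 0 <= target_mass_mg <= total_protein_mass_mg:
        raise ValueError("target mass must lie between 0 and the total mass")
    return 100.0 * target_mass_mg / total_protein_mass_mg


# Hypothetical masses, for illustration only.
after_tag_removal = purity_percent(7.4, 10.0)  # checked against the 70% threshold
after_hplc = purity_percent(4.8, 5.0)          # checked against the 95% threshold

print(after_tag_removal >= 70.0, after_hplc >= 95.0)
```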



EXAMPLE 10

Production of Anti-MDZ3, MDZ4, MDZ7 or MDZ12 Antibody

Purified proteins prepared as in Example 6 are
separately conjugated to carrier proteins and used to
prepare murine monoclonal antibodies by standard techniques.
Initial screening with the unconjugated purified proteins,
followed by competitive inhibition screening using peptide
fragments of MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS or
MDZ12bL, identifies monoclonal antibodies with specificity
for MDZ3, MDZ4, MDZ7, MDZ12a, MDZ12bS or MDZ12bL.

EXAMPLE 11
Use of MDZ3, MDZ4, MDZ7 or MDZ12 Probes
and Antibodies for Diagnosis

After informed consent is obtained, samples are
drawn from tumors, and tested for MDZ3, MDZ4, MDZ7, MDZ12a
or MDZ12b mRNA levels by standard techniques and tested
additionally for MDZ3, MDZ4, MDZ7, MDZ12a or MDZ12b protein
levels using anti-MDZ3, MDZ4, MDZ7, MDZ12a or MDZ12b
antibodies in a standard ELISA. After data are unblinded, a
statistically significant correlation of aberrant expression
of each of the above-described genes is seen with various
tumor types.
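
The text does not say which statistical test is used to assess the correlation. As one possible approach, assumed here purely for illustration, a two-sided Fisher exact test can be applied to a 2x2 table of expression status against tumor type. The counts below are hypothetical and are not data from this document.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts, for illustration only (not data from this document):
# rows = tumor type of interest / other samples,
# columns = aberrant expression / normal expression.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"two-sided p = {p_value:.4f}")
```

An exact test is chosen in this sketch because per-tumor-type sample counts may be small; with larger cohorts a chi-square test would be a reasonable alternative.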

EXAMPLE 12

Use of MDZ3, MDZ4, MDZ7 or MDZ12 Nucleic Acids, Proteins,
and Antibodies in Therapy

Once mutations of MDZ3, MDZ4, MDZ7 or MDZ12 have
been detected in patients, normal MDZ3, MDZ4, MDZ7 or MDZ12
is reintroduced into the patient's tumor cells by
introduction of expression vectors that drive MDZ3, MDZ4,
MDZ7 or MDZ12 expression or by introducing MDZ3, MDZ4, MDZ7
or MDZ12 proteins into cells, with statistically significant
improvement in patient longevity.

In patients in whom expression is increased, 
antibodies for the mutated forms of MDZ3, MDZ4, MDZ7 or 
MDZ12 are used to block the function of the abnormal forms 
of the protein. 



EXAMPLE 13 
MDZ3 Disease Associations 

Diseases that map to the MDZ3 chromosomal region
are shown in Table 6. Mutation of the MDZ3 gene contributes
to one or more of these conditions.



Table 6
Diseases mapped to human chromosome 7q (MDZ3 region)

mim_num   disease                              chromosomal location
209850    Autism, susceptibility to            7q
145290    Hyperreflexia                        7q
603511    Limb-girdle muscular dystrophy-1D    7q



EXAMPLE 14 
MDZ4 Disease Associations 

Diseases that map to the MDZ4 chromosomal region 
are shown in Table 7. Mutation of the MDZ4 gene contributes 
to one or more of these conditions. 



Table 7
Diseases mapped to the human MDZ4 chromosomal region

mim_num   disease                                        chromosomal location
106300    Ankylosing spondylitis                         6p21.3
108800    Atrial septal defect, secundum type            6p21.3
146520    Hypotrichosis simplex of scalp                 6p21.3
222100    Insulin-dependent diabetes mellitus-1          6p21.3
137100    Immunoglobulin A deficiency                    6p21.3
146850    Immune suppression to streptococcal antigen    6p21.3
604809    Panbronchiolitis, diffuse                      6p21.3
167250    Paget disease of bone                          6p21.3
177900    Psoriasis susceptibility 1                     6p21.3
179450    Ragweed sensitivity                            6p21.3
150270    Laryngeal adductor paralysis                   6p21.3-p21.2



EXAMPLE 15 
MDZ7 Disease Associations 

Diseases that map to the MDZ7 chromosomal region 
are shown in Table 8. Mutation of the MDZ7 gene contributes 
to one or more of these conditions. 



Table 8

mim_num   disease                                                  chromosomal location
157700    Mitral valve prolapse, familial                          16p12.1-p11.2
602066    Convulsions, infantile and paroxysmal choreoathetosis    16p12-q12
266600    Inflammatory bowel disease-1                             16p12-q13
186580    Arthrocutaneouveal granulomatosis                        16p12-q21
128200    Paroxysmal kinesigenic choreoathetosis                   16p11.2-q12.1


EXAMPLE 16 
MDZ12 Disease Associations 

Diseases that map to the MDZ12 chromosomal region
are shown in Table 9. Mutation of the MDZ12 gene contributes
to one or more of these conditions.

Table 9

mim_num   disease                                                                        chromosomal location
604329    Hypertension, essential, susceptibility to, 2                                  15q
214900    Cholestasis-lymphedema syndrome                                                15q
604416    Pyogenic sterile arthritis, pyoderma gangrenosum, and acne (PAPA syndrome)     15q24-q26.1
603813    Hypercholesterolemia, familial, autosomal recessive, 1                         15q25-q26
600318    Insulin-dependent diabetes mellitus-3                                          15q26
166800    Otosclerosis 1                                                                 15q26.1-qter



All patents, patent publications, and other
published references mentioned herein are hereby
incorporated by reference in their entireties as if each had
been individually and specifically incorporated by reference
herein. While preferred illustrative embodiments of the
present invention are described, one skilled in the art will
appreciate that the present invention can be practiced by
other than the described embodiments, which are presented
for purposes of illustration only and not by way of
limitation. The present invention is limited only by the
claims that follow.